JUL 20 1937 


AMERICAN JOURNAL OF MATHEMATICS


FOUNDED BY THE JOHNS HOPKINS UNIVERSITY 


EDITED BY 


E. T. BELL ABRAHAM COHEN 
CALIFORNIA INSTITUTE OF TECHNOLOGY    THE JOHNS HOPKINS UNIVERSITY


E. W. CHITTENDEN F. D. MURNAGHAN 
UNIVERSITY OF IOWA THE JOHNS HOPKINS UNIVERSITY 


J. F. RITT 
COLUMBIA UNIVERSITY 


WITH THE COOPERATION OF 


FRANK MORLEY MARSTON MORSE G. C. EVANS 

W. A. MANNING E. P. LANE AUREL WINTNER 
HARRY BATEMAN ALONZO CHURCH GABRIEL SZEGO 
HARRY LEVY L. R. FORD R. L. WILDER 

J. R. KLINE OSCAR ZARISKI R. D. JAMES 


PUBLISHED UNDER THE JOINT AUSPICES OF 


THE JOHNS HOPKINS UNIVERSITY 
AND 


THE AMERICAN MATHEMATICAL SOCIETY 


Volume LIX, Number 3 
JULY, 1937 


THE JOHNS HOPKINS PRESS 
BALTIMORE, MARYLAND
U. S. A.




CONTENTS 
PAGE
On the representations of the symmetric group. By F. D. MURNAGHAN, 437
The Heaviside operational calculus. By D. G. BOURGIN and R. J. DUFFIN, 489
Note on formal logic. By M. H. STONE.
A case of coloration in the four color problem. By C. E. WINN.
On the fundamental group of a certain class of plane algebraic curves. By W. S. TURPIN.
Geometry of turbines, flat fields, and differential equations. By EDWARD KASNER and JOHN DE CICCO.
Parallelism and equidistance of congruences of curves of orthogonal ennuples. By R. M. PETERS.
On the non-alternating images of linear graphs. By DICK WICK HALL.
Representations in certain pure forms of degrees higher than the second. By E. T. BELL.
On the normal forms of linear canonical transformations in dynamics. By JOHN WILLIAMSON.
Criteria for certain higher congruences. By LEONARD CARLITZ.
On a trigonometrical series of Riemann. By AUREL WINTNER.
On divergent infinite convolutions. By E. R. VAN KAMPEN and AUREL WINTNER.
An analogue of Jacobi’s condition for the problem of Mayer with variable end points. By THOMAS FREEMAN COPE.
On the asymptotic distribution of ζ′/ζ(s) in the critical strip. By RICHARD KERSHNER and AUREL WINTNER.
On the addition of convex curves and the densities of certain infinite convolutions. By E. R. VAN KAMPEN.
On the partial sums of certain Fourier series. By OTTO SZÁSZ.


THE AMERICAN JOURNAL OF MATHEMATICS will appear four times yearly.
The subscription price of the JOURNAL for the current volume is $7.50 (foreign postage 50 cents); single numbers $2.00.
A few complete sets of the JOURNAL remain on sale.
Papers intended for publication in the JOURNAL may be sent to any of the Editors.
Editorial communications may be sent to Dr. A. COHEN at The Johns Hopkins University.
Subscriptions to the JOURNAL and all business communications should be sent to THE JOHNS HOPKINS PRESS, BALTIMORE, MARYLAND, U. S. A.

Entered as second-class matter at the Baltimore, Maryland, Postoffice, acceptance for mailing at special rate of postage provided for in Section 1103, Act of October 3, 1917, Authorized on July 3, 1918.

PRINTED IN THE UNITED STATES OF AMERICA
BY J. H. FURST COMPANY, BALTIMORE, MARYLAND




ON THE REPRESENTATIONS OF THE SYMMETRIC GROUP.* 


By F. D. MURNAGHAN.


Introduction. The representation theory of the symmetric group (group of $n!$ permutations of $n$ letters) was initiated by Frobenius some forty years ago and was developed in the, now classical, papers of Schur and Young. More recently Littlewood and Richardson (18) have discussed in detail the problem of the construction of the character table and have used a recurrence formula (passing from the symmetric group on $n-1$ letters to the symmetric group on $n$ letters) due to Schur in order to determine the characters of those classes of permutations which contain at least one unary cycle (= fixed letter). We show in the present paper that this recurrence formula of Schur is but a special case of a general recurrence formula by means of which the characters of a class containing at least one cycle on $p$ letters ($1 \le p < n$) may be determined from the characters of the symmetric group on $n - p$ letters. As the characters of the class containing just one cycle (on $n$ letters) are trivially evident (as was pointed out by Frobenius) the construction of the character tables for the various symmetric groups ($n = 1, 2, 3, \ldots$) is a routine matter demanding only paper and ink; and the easiest characters to calculate are those of classes containing cycles on the greatest number of letters.

The representation theory of the symmetric group is of importance in nuclear physics and in this connection the following two questions are of particular significance.

1. If we imagine the $n$ letters, whose permutations constitute the symmetric group, to be divided up in compartments or boxes containing, respectively, $\lambda_1, \lambda_2, \ldots, \lambda_k$ letters (so that $\lambda_1 + \lambda_2 + \cdots + \lambda_k = n$) we obtain a subgroup of the symmetric group by considering those permutations which do not send any letter out of its box. The cosets (right or left) of this subgroup furnish a representation (whose elements are permutation matrices) of the symmetric group; this representation is, in general, reducible and it is important to determine its analysis into irreducible components. The solution of this question is quite simple and is well known when $k = 2$, i.e. when there are but two boxes. We give the solution in the general case.

* Received May 31, 1937. This paper is an elaboration of an address delivered April 12, 1937, at the Institute for Advanced Study during the author’s stay there as guest member 1936-37.


2. The “direct product” of an irreducible representation of the symmetric group on $n$ letters by an irreducible representation of the symmetric group on $m$ letters furnishes a representation, in general reducible, of the symmetric group on $n + m$ letters and it is important to determine the analysis of this reducible representation into its irreducible components. We show how to do this, without having to use the character tables, and record the results for all values of $n$ and $m$ for which $n + m \le 9$.

In the hope of making the theory of the representations of the symmetric 
group more accessible to workers in nuclear physics we have made the following 
account somewhat self-contained. The original papers of Frobenius, and 
particularly those of Schur, arouse in a persevering reader an emotion akin to 
that inspired by one of the great symphonies; but they are by no means easy 
reading and we hope that a somewhat elementary orchestration may acquaint 
a larger audience with the work of the masters. It is a pleasure to here record 
our obligation, amongst others, to Professor Wedderburn for a pregnant remark 
which materially aided and simplified our treatment of problem 1 of the preceding paragraph.


1. The characteristics of a finite group. Let $G$ be a finite group of order $N$; its elements will fall into $r$ classes (of conjugate elements) $C_1, \ldots, C_r$ such that if $g_p \in C_p$ each element of $C_p$ is of the form $g g_p g^{-1}$, $g \in G$. We denote by $N_p$ the number of elements in the $p$-th class $C_p$ so that, $C_1$ being the class consisting of the identity element $g_1$, $N_1 = 1$ and $N_1 + N_2 + \cdots + N_r = N$. By a representation of $G$ is meant a linear group homomorphic to $G$ and it is well known that $G$ possesses exactly $r$ non-equivalent irreducible representations $\Gamma_1, \ldots, \Gamma_r$. These are distinguished from one another by their characters and we denote by $\chi_p^q$ the character of $\Gamma_p$ associated with the class $C_q$; i.e. $\chi_p^q = \chi_p(g_q)$ where $g_q \in C_q$. These characters satisfy certain fundamental orthogonality relations which are most conveniently stated as follows. If $a(g)$, $b(g)$ are any two complex valued functions defined over $G$ we denote by $(a \cdot b)$ the average of the product $a(g)\bar{b}(g)$ over $G$ (the superposed bar denoting, as usual, the complex conjugate):




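\[ (a \cdot b) = \frac{1}{N} \sum_{g} a(g)\,\bar{b}(g). \]

We shall be interested only in the case where $a(g)$ and $b(g)$ are class functions: $a(g_q) = a^q$; $b(g_q) = b^q$; $g_q \in C_q$, and then $(a \cdot b) = \frac{1}{N}\sum_q N_q a^q \bar{b}^q$. Then the orthogonality relations referred to are

\[ (\chi_p \cdot \chi_q) = 0,\ p \ne q;\quad (\chi_p \cdot \chi_p) = 1;\quad (p, q = 1, \ldots, r). \tag{1} \]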


These imply that any class function $a$ is a linear combination of the $r$ functions $\chi_p$: $a^q = c^p \chi_p^q$ (usual summation convention) where $c^p = (a \cdot \chi_p)$. Denoting by $s$ a class function whose $r$ components $s^q$ $(q = 1, \ldots, r)$ are indeterminates, the expression

\[ \phi_p(s) = (s \cdot \chi_p) = \frac{1}{N} \sum_q N_q \bar{\chi}_p^q s^q \]

is called the characteristic of the irreducible representation $\Gamma_p$ and the characters $\chi_p^q$ are called the components of this characteristic. Any representation of $G$ is of the form $c^\alpha \Gamma_\alpha$ where the coefficients $c^\alpha$ are integers (positive or zero) and the characters of this representation are $c^\alpha \chi_\alpha^q$; the corresponding expression $c^\alpha \phi_\alpha(s)$ being termed the characteristic of the given representation (with components $c^\alpha \chi_\alpha^q$). When all the coefficients $c^\alpha$ vanish save one which is unity, so that the representation is irreducible, the characteristic is termed simple; otherwise it is called compound so that the characteristic of a reducible representation is compound. It is occasionally convenient to allow the coefficients $c^\alpha$ in the expression $c^\alpha \phi_\alpha(s)$ to take negative integral as well as positive integral or zero values and then $c^\alpha \phi_\alpha(s)$ is termed a generalised characteristic, it being clearly understood that if any of the coefficients $c^\alpha$ are negative the components $c^\alpha \chi_\alpha^q$ of the generalised characteristic are not the characters of any representation.

If $(s \cdot c^\alpha \chi_\alpha)$ is a generalised characteristic with components $c^\alpha \chi_\alpha^q = a^q$ we see at once from the orthogonality relations (1) that $(c^\alpha \chi_\alpha \cdot c^\beta \chi_\beta) = \sum_{\alpha=1}^{r} (c^\alpha)^2$ so that $(a \cdot a) = 1$ implies all the coefficients $c^\alpha$ zero save one which $= \pm 1$. The generalised characteristic will, therefore, be simple if $(a \cdot a) = 1$ and if $a^1 > 0$ (for the coefficient of $s^1$ in a simple characteristic yields, on multiplication by $N$, the dimension of the corresponding irreducible representation of $G$, and is, accordingly, positive). Amongst the irreducible representations $(\Gamma_1, \ldots, \Gamma_r)$ of any finite group occurs the identity representation $\Gamma_1$ (in which to each element $g \in G$ there corresponds the one



dimensional unit matrix) and the associated simple characteristic is called the principal characteristic of $G$; its explicit expression is $\phi_1(s) = \frac{1}{N} \sum_{q=1}^{r} N_q s^q$ so that the coefficient of $s^q$ in $\phi_1(s)$ yields, on multiplication by $N$, the order of $G$, the number of elements in the class $C_q$. Finally the orthogonality relations (1) express the fact that the numbers $\chi_p^q \sqrt{N_q/N}$ are the elements of an $r \times r$ unitary matrix; so that

\[ \sum_p \chi_p^q \bar{\chi}_p^s = 0,\ q \ne s;\quad \sum_p \chi_p^q \bar{\chi}_p^q = N/N_q. \tag{2} \]

Hence the equations $\phi_p(s) = \frac{1}{N} \sum_q N_q \bar{\chi}_p^q s^q$ may be solved for the indeterminates $s^q$, the solution being

\[ s^q = \sum_p \chi_p^q \phi_p(s). \]


Before passing to our subject proper, the symmetric group, it is necessary to say a few words concerning a basic theorem of Frobenius which enables us to derive from a characteristic of a subgroup $H$ of $G$ a characteristic of $G$ itself. Let $H$ be of order $M$ and denote by $h$ a typical element of $H$; the class of $H$ to which $h$ belongs is a subset, proper or not, of the class of $G$ to which $h$ belongs. But a class $C_q$ of $G$ may contain several classes of $H$ or none at all; we say that $H$ refines the classes of $G$. If $\Gamma$ is any representation of $G$ it induces a representation $\Gamma^*$ of $H$ where $\Gamma^*$ consists of those linear operators of $\Gamma$ which remain after the operators which correspond to elements of $G$ which are not in $H$ are rejected. If, in particular, $\Gamma$ is an irreducible representation of $G$, $\Gamma^*$ will be, in general, a reducible representation of $H$ (since it may be possible to find a proper, non-trivial subspace of the carrier space of $\Gamma$ which is invariant under all the operators of $\Gamma^*$ although the irreducibility of $\Gamma$ guarantees that no such subspace exists which is invariant under all the operators of $\Gamma$). If we have any class function $a(g)$ defined over $G$ it induces by the process of projection, $a^*(h) = a(h)$, a class function $a^*(h)$ defined over $H$.

Since $a(g) = \sum_\alpha (a \cdot \chi_\alpha)\chi_\alpha(g)$ we have $a^*(h) = \sum_\alpha (a \cdot \chi_\alpha)\chi_\alpha(h)$ or, equivalently, $a^* = \sum_\alpha (a \cdot \chi_\alpha)\chi^*_\alpha$. In particular when $a$ is the indeterminate $s$ whose components $s^q$ appeared in the definition of the simple characteristics of $G$ we have $s^* = \sum_\alpha \phi_\alpha(s)\chi^*_\alpha$ where the numbers $\chi^{*j}_\alpha$ are the characters of a representation (in general reducible) of $H$, the index $j$ running over the classes




of $H$. If these number $t$ and if the characters of the irreducible representations of $H$ be denoted by $\xi_\beta^j$ $(\beta, j = 1, \ldots, t)$, we may write $\chi^*_\alpha = c_\alpha^\beta \xi_\beta$, where the coefficients $c_\alpha^\beta$ are integers, positive or zero. The expression $(s^* \cdot d^\beta \xi_\beta) = d^\beta (s^* \cdot \xi_\beta)$, where the coefficients $d^\beta$ are integers, positive, negative or zero, is a generalised characteristic of $H$, it being clearly understood that the indeterminates $s^*(h)$ are conditioned by the fact that they are the same for all elements of $H$ which lie in the same class of $G$ (and not merely the same for all elements of $H$ which lie in the same class of $H$). On substituting for $s^*$ its expression given above this generalised characteristic of $H$ appears in the form

\[ \sum_\alpha \phi_\alpha(s)\,(\chi^*_\alpha \cdot d^\beta \xi_\beta) = \sum_\alpha \Big( \sum_{\beta=1}^{t} c_\alpha^\beta d^\beta \Big) \phi_\alpha(s) \]

owing to the orthogonality relations amongst the characters of the irreducible representations of $H$. Since $\sum_{\beta=1}^{t} c_\alpha^\beta d^\beta$ is an integer, positive, negative, or zero, it follows that any generalised characteristic of $H$ furnishes by the procedure outlined a generalised characteristic of $G$ (of which the components corresponding to classes of $G$ which contain no elements of $H$ are zero). As a
trivial instance of this theorem let $G$ be the symmetric group on 2 letters and $H$ the identity element. The principal characteristic of $H$ is $s^1$ and this being a generalised characteristic of $G$ its components are $(2, 0)$ since $s^1 = \tfrac12(2s^1 + 0 \cdot s^2)$. The generalised characteristic of $G$ obtained in this way contains the principal characteristic $\phi_1(s) = \tfrac12(s^1 + s^2)$ once since $c^1 = \tfrac12(2 \cdot 1 + 0 \cdot 1) = 1$ and the remaining characteristic is $\tfrac12(s^1 - s^2)$. This characteristic is simple since $\tfrac12\{(1)^2 + (-1)^2\} = 1$ and, in addition, the coefficient of $s^1$ is positive. Thus the two simple characteristics of the symmetric group on two letters are $\phi_1(s) = \tfrac12(s^1 + s^2)$ and $\phi_2(s) = \tfrac12(s^1 - s^2)$, the corresponding characters being $(1, 1)$ and $(1, -1)$ respectively. As a less trivial example let $G$ be the symmetric group on 3 letters; $N = 6$, $N_1 = 1$, $N_2 = 3$, $N_3 = 2$ and let $H$ be the symmetric group on two of the three letters. The principal characteristic of $H$ is $\tfrac12(s^1 + s^2)$ and writing this in the form $\tfrac16(3s^1 + 3s^2 + 0 \cdot s^3)$ its components (when viewed as a generalised characteristic of $G$) are $(3, 1, 0)$. It contains the principal characteristic $\phi_1(s) = \tfrac16(s^1 + 3s^2 + 2s^3)$ of $G$ once since $c^1 = \tfrac16(1 \cdot 3 \cdot 1 + 3 \cdot 1 \cdot 1 + 2 \cdot 0 \cdot 1) = 1$ and the remaining characteristic is $\tfrac16(2s^1 + 0 \cdot s^2 - 2s^3)$ with components $(2, 0, -1)$. This characteristic is simple since $\tfrac16(2^2 + 0^2 + 2 \cdot (-1)^2) = 1$ and the coefficient of $s^1$ is positive. We have, then, the two simple characteristics $\phi_1(s) = \tfrac16(s^1 + 3s^2 + 2s^3)$; $\phi_2(s) = \tfrac16(2s^1 + 0 \cdot s^2 - 2s^3)$ of the symmetric group on 3 letters. Starting with the characteristic $\tfrac12(s^1 - s^2)$ of $H$ we find $c^1 = 0$, $c^2 = 1$, the remaining characteristic $\phi_3(s) = \tfrac16(s^1 - 3s^2 + 2s^3)$ being simple. $\phi_1(s)$, $\phi_2(s)$, $\phi_3(s)$ are the three simple characteristics of the symmetric group on 3 letters, the corresponding characters being $(1, 1, 1)$, $(2, 0, -1)$, and $(1, -1, 1)$ respectively.
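
The arithmetic of this example can be checked mechanically. The following is a minimal sketch, not part of the original paper: the names `CLASS_SIZES`, `CHARACTERS`, `inner` and `decompose` are illustrative assumptions. It verifies the orthogonality relations (1) for the three characters found above and recomputes the coefficients $c^p = (a \cdot \chi_p)$ for the two generalised characteristics obtained from $H$.

```python
from fractions import Fraction

# Class sizes N_q and characters of the symmetric group on 3 letters, in the
# class order of the text: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
CHARACTERS = [
    [1, 1, 1],    # phi_1, the principal characteristic
    [2, 0, -1],   # phi_2
    [1, -1, 1],   # phi_3
]

def inner(a, b, sizes=CLASS_SIZES):
    """Average of a(g) * b(g) over the group, for real class functions."""
    order = sum(sizes)
    return sum(Fraction(n * x * y, order) for n, x, y in zip(sizes, a, b))

def decompose(a):
    """Coefficients c^p = (a . chi_p) of a class function a."""
    return [inner(a, chi) for chi in CHARACTERS]

# Gram matrix of the characters; relations (1) say it is the unit matrix.
gram = [[inner(x, y) for y in CHARACTERS] for x in CHARACTERS]

# Components (3, 1, 0) come from the principal characteristic of H,
# components (3, -1, 0) from the characteristic (s^1 - s^2)/2 of H.
from_principal = decompose([3, 1, 0])
from_alternating = decompose([3, -1, 0])
```

Exact rational arithmetic (`Fraction`) is used so that the averages compare exactly with the integers in the text.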

We obtain a representation, in general reducible, of $G$ in the following manner. Any subgroup $H$, of order $M$, of $G$ has $d = N/M$ right cosets $H_1 = H$; $H_2 = Hg_2$, $\ldots$, $H_d = Hg_d$ and if we define $(H'_1, \ldots, H'_d)$ by the equation $H'_p = H_p g$ where $g$ is an arbitrary element of $G$ the symbols $H'$ constitute a permutation of the symbols $H$; i.e. $H' = P(g)H$ where $P(g)$ is a permutation matrix. The matrices $P(g)$ furnish a representation of $G$; if $g_q$ is any member of the class $C_q$ of $G$ $(q = 1, \ldots, r)$, the character $\chi^q$ associated with the class $C_q$ in this representation is the number of ones in the diagonal of the permutation matrix $P(g_q)$, i.e. the number of elements $g_p$ for which $H_p g_q = H_p$. In other words $\chi^q$ is the number of elements $g_p$ for which $g_p g_q g_p^{-1} \in H$ or, since this number is the same for each $g_q \in C_q$, $\chi^q =$ (number of times $g_p C_q g_p^{-1}$ lies in $H$) $\div N_q$. But $g_p C_q g_p^{-1} = C_q$, so that as $p$ runs from 1 to $d$ we obtain $C_q$ $d = N/M$ times. Hence $\chi^q = N/M$ times the number of elements of $H$ in $C_q$, divided by $N_q$. This suffices to show that the generalised characteristic of $G$ obtained from the principal characteristic of $H$ by the method of the previous paragraph is really a compound (or simple) characteristic; in fact the characteristic of that representation (by permutation matrices) which is furnished by the cosets of $H$ in $G$. For the principal characteristic of $H$ is $\frac{1}{M}\sum_h s(h)$; written as a generalised characteristic of $G$ it appears as $\frac{1}{N}\sum_h N s(h)/M$ so that its components $c^\alpha \chi_\alpha^q$ are $N/M$ times the number of elements of $H$ in $C_q$, divided by $N_q$. These being precisely the characters of the representation referred to, the theorem stated follows since two representations with the same characters, or characteristic, are equivalent.
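
The counting argument above can be checked by brute force on a small case. The sketch below is illustrative only (the function names and the 0-based encoding of permutations as tuples of images are assumptions, not the paper's notation). It builds the right cosets of a subgroup, counts the cosets fixed by each group element, and compares the result with the closed form $(N/M) \times$ (number of elements of $H$ in $C_q$) $\div N_q$.

```python
from itertools import permutations

def compose(p, q):
    """Permutation 'p then q' on points 0..n-1 (tuples of images)."""
    return tuple(q[p[i]] for i in range(len(p)))

def cycle_type(p):
    """Cycle lengths of p, longest first; this labels the class of p."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def coset_character(group, subgroup):
    """Map cycle type -> number of right cosets Hg' with (Hg')g = Hg'."""
    cosets = {frozenset(compose(h, g) for h in subgroup) for g in group}
    character = {}
    for g in group:
        fixed = sum(1 for c in cosets
                    if frozenset(compose(x, g) for x in c) == c)
        character[cycle_type(g)] = fixed
    return character

def closed_form(group, subgroup, ctype):
    """(N/M) * (elements of H in the class) / N_q, as derived in the text."""
    n_q = sum(1 for g in group if cycle_type(g) == ctype)
    in_h = sum(1 for h in subgroup if cycle_type(h) == ctype)
    return len(group) * in_h // (len(subgroup) * n_q)

G = list(permutations(range(3)))      # symmetric group on 3 letters
H = [p for p in G if p[2] == 2]       # symmetric group on two of the letters
chi = coset_character(G, H)
```

For the symmetric group on 3 letters and the subgroup on two of them this reproduces the components $(3, 1, 0)$ found earlier.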


2. The principal and alternating characteristics of the symmetric group. Construction of reducible representations. Each permutation of the symmetric group on $n$ letters may be written in a unique manner as a product of cycles, no letter appearing in more than one cycle. Two permutations with the same cycle structure, i.e. containing the same number, $\alpha_1$, of cycles on one letter (= unary cycles), the same number, $\alpha_2$, of cycles on two letters (= binary cycles), the same number, $\alpha_3$, of ternary cycles and so on, belong to the same class. For example if $n = 5$ and $P = (12)(345)$, $Q = (23)(154)$ the permutation $(123)(45)$ transforms $P$ into $Q$. We refer to the class with the cycle structure $(\alpha) = (\alpha_1, \ldots, \alpha_n)$ as the class $(\alpha)$ and observe that $\alpha_1 + 2\alpha_2 + \cdots + n\alpha_n = n$ so that the number of classes, and hence of irreducible representations, is the number of solutions of this equation in non-negative integers. Writing

\[ \lambda_1 = \alpha_1 + \alpha_2 + \cdots + \alpha_n;\quad \lambda_2 = \alpha_2 + \cdots + \alpha_n;\quad \ldots;\quad \lambda_n = \alpha_n \]

it is clear that

\[ \lambda_1 + \lambda_2 + \cdots + \lambda_n = n;\quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0. \]

If $k$ is such that $\lambda_k > 0$, $\lambda_{k+1} = \cdots = \lambda_n = 0$ we have

\[ \lambda_1 + \lambda_2 + \cdots + \lambda_k = n;\quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k > 0 \]
so that the number of classes, or of irreducible representations, is the same as the number of partitions of $n$ into sums of positive integers. We shall indicate a partition by the symbol $(\lambda_1, \lambda_2, \ldots, \lambda_k)$ where the parts $\lambda_1, \ldots, \lambda_k$ are written in non-increasing order and shall use an obvious exponent notation for the sake of brevity when two or more parts are equal. E.g. $(3, 2^2, 1^3)$ denotes the partition $(3, 2, 2, 1, 1, 1)$ of 10: $3 + 2 + 2 + 1 + 1 + 1 = 10$. Each partition is conveniently indicated by a diagram of horizontal rows of dots all beginning on the same vertical line; thus $(3, 2^2, 1^3)$ is indicated by the diagram

    . . .
    . .
    . .
    .
    .
    .

By a simple interchange of the rows and columns of a diagram we obtain a second diagram (termed the associate of the original diagram) and hence a second partition of $n$ (termed the associate of the original partition). E.g., the associate of $(3, 2^2, 1^3)$ is $(6, 3, 1)$. When the associate is identical with the original the diagram (and partition) are termed self-associated. E.g., $(3, 2, 1)$ is a self-associated partition of 6. We shall see how to attach to each diagram, or partition of $n$, a uniquely determined irreducible representation of the symmetric group and then the representations attached to associated diagrams, or partitions of $n$, are termed associated; a representation attached to a self-associated diagram or partition being termed self-associated. We shall denote the partition associated with $(\lambda) = (\lambda_1, \lambda_2, \ldots, \lambda_k)$ by $(\mu) = (\mu_1, \mu_2, \ldots, \mu_j)$ and it is an immediate consequence of the definition that $\mu_1 = k$, $j = \lambda_1$; it is also clear that there are $\alpha_1$ ones, $\alpha_2$ twos, $\alpha_3$ threes etc. in the partition $(\mu)$ where $\alpha_1 = \lambda_1 - \lambda_2$; $\alpha_2 = \lambda_2 - \lambda_3$, $\ldots$, $\alpha_n = \lambda_n$ or, equivalently,

\[ \lambda_p = \alpha_p + \alpha_{p+1} + \cdots + \alpha_n;\quad (p = 1, 2, \ldots, n). \]
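
The passage from a partition to its associate is easy to mechanise. The following is a minimal sketch (the function names `associate` and `multiplicities` are illustrative assumptions): the associate is read off as the column lengths of the dot diagram, and the multiplicities $\alpha_p$ of the parts of the associate can then be compared with the differences $\lambda_p - \lambda_{p+1}$.

```python
def associate(partition):
    """Associate partition: the column lengths of the dot diagram."""
    parts = sorted(partition, reverse=True)
    if not parts:
        return ()
    return tuple(sum(1 for part in parts if part > col)
                 for col in range(parts[0]))

def multiplicities(partition, n):
    """alpha_p = number of parts equal to p, for p = 1, ..., n."""
    return tuple(sum(1 for part in partition if part == p)
                 for p in range(1, n + 1))

lam = (6, 3, 1)
mu = associate(lam)                      # the associated partition
alpha = multiplicities(mu, 10)           # ones, twos, threes, ... in mu
padded = lam + (0,) * (11 - len(lam))    # lambda_1, ..., lambda_11 with zeros
differences = tuple(padded[p] - padded[p + 1] for p in range(10))
```

Here `alpha` and `differences` agree, which is the relation $\alpha_p = \lambda_p - \lambda_{p+1}$ stated above for this example.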

The number $N_{(\alpha)}$ of elements in the class $(\alpha)$ is readily found. If any such permutation is written down with the $\alpha_1$ unary cycles appearing first, the $\alpha_2$ binary cycles next, and so on, we obtain by mere permutation of the letters $n!$ permutations all in the class $(\alpha)$. But there are repetitions due to the fact that each cycle may begin, without changing it, with any one of its letters and to the fact that the $\alpha_r$ $r$-cycles may be permuted amongst themselves without affecting the permutation. Hence

\[ N_{(\alpha)} = \frac{n!}{1^{\alpha_1}\alpha_1!\, 2^{\alpha_2}\alpha_2! \cdots n^{\alpha_n}\alpha_n!} = \frac{n!}{\prod_{p=1}^{n} p^{\alpha_p}\alpha_p!}. \]
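
As a quick numerical check of the class-size formula, the sketch below (with illustrative function names; classes are given by their lists of cycle lengths rather than by the exponents $\alpha_p$) enumerates the partitions of $n$ and evaluates $n! / \prod_p p^{\alpha_p}\alpha_p!$ for each.

```python
from math import factorial

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def class_size(cycle_lengths):
    """Number of permutations with the given cycle lengths."""
    n = sum(cycle_lengths)
    denom = 1
    for p in set(cycle_lengths):
        a = cycle_lengths.count(p)       # alpha_p, the number of p-cycles
        denom *= p**a * factorial(a)
    return factorial(n) // denom

sizes_6 = {c: class_size(c) for c in partitions(6)}
```

The class sizes for a given $n$ must add up to $n!$, which gives a simple consistency test.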
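If $(s_1, \ldots, s_n)$ are indeterminates the expressions $s^{(\alpha)} = s_1^{\alpha_1} s_2^{\alpha_2} \cdots s_n^{\alpha_n}$ are class functions defined over the symmetric group and we use them in the definition of the simple characteristics of the symmetric group:

\[ \phi_{(\lambda)}(s) = \frac{1}{n!} \sum_{(\alpha)} N_{(\alpha)} \bar{\chi}_{(\lambda)}^{(\alpha)} s^{(\alpha)}. \]

We shall denote the principal characteristic (i.e. the simple characteristic corresponding to the identity representation) by $q_n(s)$ so that

\[ q_n(s) = \frac{1}{n!} \sum_{(\alpha)} N_{(\alpha)} s^{(\alpha)} = \sum_{(\alpha)} \frac{s_1^{\alpha_1} s_2^{\alpha_2} \cdots s_n^{\alpha_n}}{1^{\alpha_1}\alpha_1!\, 2^{\alpha_2}\alpha_2! \cdots n^{\alpha_n}\alpha_n!}. \tag{4} \]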


These polynomials $(n = 1, 2, \ldots)$ in the indeterminates $(s_1, s_2, \ldots)$ are the bricks with which will be built the characters of the irreducible representations of the symmetric group and we write out explicitly the first seven of them:

\[ q_1(s) = s_1;\quad q_2(s) = \frac{1}{2!}(s_1^2 + s_2);\quad q_3(s) = \frac{1}{3!}(s_1^3 + 3s_1s_2 + 2s_3) \]

\[ q_4(s) = \frac{1}{4!}\{s_1^4 + 6s_1^2s_2 + 8s_1s_3 + 3s_2^2 + 6s_4\} \]

\[ q_5(s) = \frac{1}{5!}\{s_1^5 + 10s_1^3s_2 + 20s_1^2s_3 + 15s_1s_2^2 + 30s_1s_4 + 20s_2s_3 + 24s_5\} \]

\[ q_6(s) = \frac{1}{6!}\{s_1^6 + 15s_1^4s_2 + 40s_1^3s_3 + 45s_1^2s_2^2 + 90s_1^2s_4 + 120s_1s_2s_3 + 144s_1s_5 + 15s_2^3 + 90s_2s_4 + 40s_3^2 + 120s_6\} \]

\[ q_7(s) = \frac{1}{7!}\{s_1^7 + 21s_1^5s_2 + 70s_1^4s_3 + 105s_1^3s_2^2 + 210s_1^3s_4 + 420s_1^2s_2s_3 + 504s_1^2s_5 + 105s_1s_2^3 + 630s_1s_2s_4 + 280s_1s_3^2 + 840s_1s_6 + 210s_2^2s_3 + 504s_2s_5 + 420s_3s_4 + 720s_7\}. \]


The terms are arranged so that $s_1^{m_1}s_2^{m_2}\cdots$ comes before $s_1^{n_1}s_2^{n_2}\cdots$ if the first non-vanishing number of the set $m_1 - n_1$, $m_2 - n_2$, $\ldots$ is positive. The polynomials $q_n(s)$ furnish at a glance the structure of the corresponding symmetric group. Thus from $q_6(s)$ we see that the $6! = 720$ permutations of the symmetric group on 6 letters divide into 11 classes there being 45 elements, for example, in the class $\alpha = (2, 2, 0, 0, 0, 0)$. It is clear from the defining formula (4) that the polynomials $q_n(s)$ satisfy the interconnecting relations

\[ \frac{\partial q_n}{\partial s_1} = q_{n-1};\quad \frac{\partial q_n}{\partial s_2} = \tfrac12 q_{n-2};\quad \frac{\partial q_n}{\partial s_3} = \tfrac13 q_{n-3} \]

and, generally,

\[ \frac{\partial q_n}{\partial s_p} = \frac{1}{p}\, q_{n-p};\quad p = 1, 2, \ldots \tag{5} \]

where, to secure the universal validity of these formulae, we define $q_0(s) = 1$; $q_{-1}(s) = q_{-2}(s) = \cdots = 0$. We shall see that the relations (5) have the following significance; they enable us to construct, in a very simple manner, the characters of any class of the symmetric group on $n$ letters which contains at least one cycle on $p$ letters, $p = 1, \ldots, n-1$, from the, supposed known, characters of the symmetric group on $(n - p)$ letters.
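
The relations (5) can be verified term by term. The sketch below is an illustration under stated assumptions: a polynomial in $s_1, s_2, \ldots$ is stored as a dictionary from exponent tuples $(\alpha_1, \ldots, \alpha_w)$ to rational coefficients, padded to a fixed number `width` of variables so that $q_n$ and $q_{n-p}$ can be compared directly; the names `structures`, `q`, `derivative` and `scaled` are hypothetical helpers, not anything defined in the paper.

```python
from fractions import Fraction
from math import factorial

def structures(n, width, p=1):
    """Yield tuples (a_1, ..., a_width) with a_1 + 2 a_2 + ... = n."""
    if p > width:
        if n == 0:
            yield ()
        return
    for a in range(n // p + 1):
        for rest in structures(n - p * a, width, p + 1):
            yield (a,) + rest

def q(n, width):
    """Coefficients of q_n(s) from formula (4); q_n = 0 for n < 0."""
    if n < 0:
        return {}
    poly = {}
    for alpha in structures(n, width):
        denom = 1
        for p, a in enumerate(alpha, start=1):
            denom *= p**a * factorial(a)
        poly[alpha] = Fraction(1, denom)
    return poly

def derivative(poly, p):
    """Partial derivative with respect to s_p."""
    out = {}
    for alpha, coeff in poly.items():
        a = alpha[p - 1]
        if a:
            lowered = alpha[:p - 1] + (a - 1,) + alpha[p:]
            out[lowered] = coeff * a
    return out

def scaled(poly, factor):
    """Multiply every coefficient by a constant factor."""
    return {alpha: coeff * factor for alpha, coeff in poly.items()}

WIDTH = 7
q6 = q(6, 6)
relations_hold = all(
    derivative(q(n, WIDTH), p) == scaled(q(n - p, WIDTH), Fraction(1, p))
    for n in range(1, WIDTH + 1)
    for p in range(1, n + 1)
)
```

Multiplying a coefficient of `q6` by $6!$ recovers the corresponding class size, for instance 45 for the exponent tuple $(2, 2, 0, 0, 0, 0)$.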

Since any cycle on $p$ letters may be written as the product of $p - 1$ binary cycles (= transpositions), e.g. $(1234) = (12)(13)(14)$, the order of the factors being from left to right, every cycle on an even number of letters is an odd permutation and every cycle on an odd number of letters is an even one. Hence all permutations in a given class $(\alpha)$ are either even or odd and we may speak of even or odd classes; a class $(\alpha)$ being even when $\alpha_2 + \alpha_4 + \alpha_6 + \cdots$ is even and odd when it is odd. Now the symmetric group on $n$ letters possesses, in addition to the identity representation, a second one-dimensional representation; namely that one which attaches to each even permutation the number 1 and to each odd permutation the number $-1$ (this being merely a sophisticated way of saying that the product of two even or odd permutations is even whilst the product of an even by an odd permutation is odd). This representation is known as the alternating representation and the corresponding simple characteristic is known as the alternating characteristic; we shall denote it by $\pi_n(s)$ so that

\[ \pi_n(s) = \frac{1}{n!} \sum_{(\alpha)} N_{(\alpha)} (-1)^{\alpha_2 + \alpha_4 + \cdots} s_1^{\alpha_1} s_2^{\alpha_2} \cdots s_n^{\alpha_n} \]

implying

\[ \pi_n(s_1, \ldots, s_n) = q_n(s_1, -s_2, s_3, -s_4, \ldots). \tag{6} \]
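
The parity rule just stated (a class is even exactly when $\alpha_2 + \alpha_4 + \cdots$ is even) can be compared with an independent definition of parity. In the sketch below, which is illustrative only, permutations are tuples of images on the points $0, \ldots, n-1$ and the independent definition counts inversions; both choices are assumptions made for the example and are not taken from the paper.

```python
from itertools import permutations

def cycle_counts(p):
    """alpha[k] = number of cycles of length k + 1 in the permutation p."""
    n = len(p)
    alpha = [0] * n
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        length, i = 0, start
        while not seen[i]:
            seen[i] = True
            i = p[i]
            length += 1
        alpha[length - 1] += 1
    return alpha

def sign_from_cycles(p):
    """+1 or -1 according as alpha_2 + alpha_4 + ... is even or odd."""
    alpha = cycle_counts(p)
    return -1 if sum(alpha[1::2]) % 2 else 1

def sign_by_inversions(p):
    """+1 or -1 according as the number of inversions is even or odd."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    """Permutation 'p then q'."""
    return tuple(q[p[i]] for i in range(len(p)))

S4 = list(permutations(range(4)))
agree = all(sign_from_cycles(p) == sign_by_inversions(p)
            for p in permutations(range(5)))
multiplicative = all(
    sign_from_cycles(compose(p, q)) == sign_from_cycles(p) * sign_from_cycles(q)
    for p in S4 for q in S4
)
```

The second check confirms, for 4 letters, that the assignment of $\pm 1$ really is a one-dimensional representation.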


Before passing to the question of associating with each partition $(\lambda)$ of $n$ a (reducible) representation of the symmetric group on $n$ letters we find it convenient to remark that the characters $\chi_{(\lambda)}^{(\alpha)}$ of the irreducible representations are all real (it appears in the sequel that they are all integers, positive, negative or zero, but this fact does not lie on the surface as does the fact of their reality). Indeed since the reciprocal of a cycle is the same cycle written in the reversed sense, e.g. $(1234)^{-1} = (1432)$, each class $(\alpha)$ contains the reciprocal of each of its permutations so that the character of any element is the same as the character of its reciprocal. But every representation of any finite group is equivalent to a representation by means of unitary matrices and, the reciprocal of a unitary matrix being its transposed conjugate, its trace is the conjugate complex of the trace of the original matrix. Hence the character $\chi(g^{-1})$ (for any representation of any finite group) is furnished by the relation $\chi(g^{-1}) = \bar{\chi}(g)$. For the symmetric group we have, in addition, the relation $\chi(g^{-1}) = \chi(g)$ and the two relations together imply $\bar{\chi}(g) = \chi(g)$, i.e. the reality of the characters of any representation of the symmetric group. We may, therefore, drop the conjugate complex sign in the explicit expressions for the simple characteristics and write

\[ \phi_{(\lambda)}(s) = \frac{1}{n!} \sum_{(\alpha)} N_{(\alpha)} \chi_{(\lambda)}^{(\alpha)} s^{(\alpha)}. \]

In order to associate with each partition $(\lambda)$ of $n$ a (reducible) representation of the symmetric group on $n$ letters, we have merely to imagine the $n$ letters placed in compartments or boxes containing, respectively, $\lambda_1, \lambda_2, \ldots, \lambda_k$ letters and then to consider the subgroup $H$ of the symmetric group $G$ which consists of those permutations which do not send any letter out of its box. This subgroup is of order $M = \lambda_1!\lambda_2!\cdots\lambda_k!$ and a typical permutation of it is of the form $P = P_1P_2\cdots P_k$, where $P_j$ denotes a permutation on the letters of the $j$-th box $(j = 1, 2, \ldots, k)$. Since the various $P_j$ operate on distinct letters the order in which the factors $P_j$ are written is indifferent and we agree to write them in the natural order $P_1P_2\cdots P_k$. If $(\alpha^j) = (\alpha_1^j, \alpha_2^j, \ldots)$ denotes the cycle structure of $P_j$ then the cycle structure of $P$ is furnished by the formula

\[ \alpha_p = \sum_{j=1}^{k} \alpha_p^j;\quad (p = 1, 2, \ldots, n). \]

Hence

\[ s^{(\alpha)} = \prod_{j=1}^{k} s^{(\alpha^j)} \]

and so the principal characteristic of $H$ is

\[ \frac{1}{M} \prod_{j=1}^{k} \sum_{P_j} s^{(\alpha^j)} \]

where $\sum_{P_j}$ denotes summation over the $\lambda_j!$ permutations on the letters in the $j$-th box. On writing this principal characteristic of $H$ in the form

\[ \prod_{j=1}^{k} \Big\{ \frac{1}{\lambda_j!} \sum_{P_j} s^{(\alpha^j)} \Big\} \]

it appears as $q_{\lambda_1}(s)q_{\lambda_2}(s)\cdots q_{\lambda_k}(s)$. This product is, accordingly, a compound characteristic of the symmetric group $G$ on $n$ letters; namely, the characteristic of the representation of $G$ furnished by the permutations of the cosets of $H$ in $G$. Since the representation is by means of permutation matrices its characters are integers positive or zero and so the components of the characteristic (compound) $q_{\lambda_1}(s)\cdots q_{\lambda_k}(s)$ of $G$ are integers positive or zero. We shall denote by $\Delta(\lambda_1, \ldots, \lambda_k)$ the representation (reducible) of the symmetric group on $n$ letters whose characteristic is $q_{\lambda_1}(s)\cdots q_{\lambda_k}(s)$.


$\Delta(\lambda_1, \ldots, \lambda_k)$ is sometimes referred to as a tensor representation for a reason that will be clear from the following examples:

1. If $k = 2$, so that $n$ is partitioned into two parts, $\Delta(\lambda_1, \lambda_2)$ is of dimension $\binom{n}{\lambda_1} = \frac{n!}{\lambda_1!\lambda_2!}$ and the cosets of $H$ permute like the products of $n$ letters $(x_1, \ldots, x_n)$ $\lambda_1$ at a time. These products may be regarded as a basis in a carrier “tensor” space of $\binom{n}{\lambda_1}$ dimensions in which $\Delta(\lambda_1, \lambda_2)$ is presented by means of permutation matrices. Thus for $n = 5$, $(\lambda) = (3, 2)$ there are 10 products; the characters of $\Delta(3, 2)$ are the numbers of these products which are left invariant by the permutations of the various classes and are at once seen to be $(10, 4, 1, 2, 0, 1, 0)$ which checks with the result

\[ q_3(s)q_2(s) = \frac{1}{5!}\{10s_1^5 + 40s_1^3s_2 + 20s_1^2s_3 + 30s_1s_2^2 + 20s_2s_3\}. \]

2. $\lambda_1 = n - 2$, $\lambda_2 = 1$, $\lambda_3 = 1$. The cosets permute like the products $x_i x_j^2$ $(i \ne j)$. For $n = 5$, $\Delta(3, 1^2)$ is of dimension 20 and its characters are $(20, 6, 2, 0, 0, 0, 0)$ (the permutation $(12)$, for instance, leaving invariant the six products $x_3x_4^2$, $x_3x_5^2$, $x_4x_5^2$, $x_3^2x_4$, $x_3^2x_5$, $x_4^2x_5$); a result which checks with

\[ q_3(s)q_1^2(s) = \frac{1}{5!}\{20s_1^5 + 60s_1^3s_2 + 40s_1^2s_3\}. \]


It is clear that the space spanned by the expressions $x_1x_2(x_1 + x_2)$ etc. is an invariant subspace of the carrier space of $\Delta(n-2, 1^2)$ as is also the space spanned by the expressions […] so that $\Delta(n-2, 1^2)$ is in general reducible. Similarly the basic tensors for $\Delta(n-3, 2, 1)$ are […]; for $\Delta(n-3, 1^3)$ they are […]; for $\Delta(n-4, 3, 1)$ they are […]; for $\Delta(n-4, 2^2)$ they are […]; for $\Delta(n-4, 2, 1^2)$ they are $x_1^3x_2^2(x_3x_4)$ and so on. The attempt to solve one of the main problems of the present paper; namely, the analysis of the reducible representation $\Delta(\lambda_1, \ldots, \lambda_k)$ into its irreducible components, by the geometrical method of “tensor representations” soon becomes hopelessly complicated and we shall make no use of this geometrical viewpoint.
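
The characters quoted in the two examples can be recomputed by counting. The sketch below rests on an identification that is standard but is stated here as an assumption: a coset of $H$ corresponds to a way of distributing the $n$ letters into labelled boxes of sizes $\lambda_1, \ldots, \lambda_k$, and a permutation leaves such a distribution invariant exactly when each of its cycles lies wholly inside one box. The function name `delta_character` and the list `CLASSES_S5` are illustrative.

```python
from itertools import product

def delta_character(box_sizes, cycle_lengths):
    """Number of box arrangements left invariant by a permutation
    whose cycles have the given lengths."""
    k = len(box_sizes)
    count = 0
    # Try every assignment of whole cycles to boxes and keep those
    # that fill each box to exactly its prescribed size.
    for assignment in product(range(k), repeat=len(cycle_lengths)):
        filled = [0] * k
        for box, length in zip(assignment, cycle_lengths):
            filled[box] += length
        if filled == list(box_sizes):
            count += 1
    return count

# Classes of the symmetric group on 5 letters, in the order of the terms
# s_1^5, s_1^3 s_2, s_1^2 s_3, s_1 s_2^2, s_1 s_4, s_2 s_3, s_5.
CLASSES_S5 = [(1, 1, 1, 1, 1), (2, 1, 1, 1), (3, 1, 1), (2, 2, 1),
              (4, 1), (3, 2), (5,)]

delta_32 = [delta_character((3, 2), c) for c in CLASSES_S5]
delta_311 = [delta_character((3, 1, 1), c) for c in CLASSES_S5]
```

The exhaustive search over assignments is only meant for small $n$; it is chosen for transparency rather than speed.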


The subgroup $H$ whose elements are $P = P_1P_2\cdots P_k$ may be termed the direct product of the subgroups $G_1, G_2, \ldots, G_k$ where $G_j$ permutes only the letters in the $j$-th box, leaving all the other letters fixed $(j = 1, \ldots, k)$; so that $G_j$ is of order $\lambda_j!$. We indicate the direct product relationship thus: $H = G_1 \times G_2 \times \cdots \times G_k$ and observe that if $\Gamma_j$ is any representation of $G_j$ $(j = 1, \ldots, k)$, the Kronecker product $\Gamma_1 \times \Gamma_2 \times \cdots \times \Gamma_k$ is a representation of $H$ whose characters are the products of the corresponding characters of $\Gamma_1, \ldots, \Gamma_k$. If $\phi_j(s)$ is the characteristic of $G_j$ associated with $\Gamma_j$ the characteristic of $H$ associated with $\Gamma_1 \times \Gamma_2 \times \cdots \times \Gamma_k$ is, accordingly, $\prod_{j=1}^{k} \phi_j(s)$ and this furnishes a compound characteristic of $G$ (when the representations $\Gamma_j$ $(j = 1, 2, \ldots, k)$ are all the identity representation $\phi_j(s) = q_{\lambda_j}(s)$ and we recover the compound characteristic $\prod_{j=1}^{k} q_{\lambda_j}(s)$ of $G$). We shall be particularly interested in the sequel in the case $k = 2$; $\Gamma_1$ will be an irreducible representation of the symmetric group on $\lambda_1$ letters and $\Gamma_2$ an irreducible representation of the symmetric group on $\lambda_2$ letters. The compound characteristic of the symmetric group on $n = \lambda_1 + \lambda_2$ letters obtained in the manner described above corresponds to a reducible representation of this group on $n$ letters which may be termed the direct product of $\Gamma_1$ and $\Gamma_2$ (and denoted $\Gamma_1 \cdot \Gamma_2$). Our problem is the analysis of $\Gamma_1 \cdot \Gamma_2$ into its irreducible components. The dimension of the direct product $\Gamma_1 \cdot \Gamma_2$ is the product of the dimensions of $\Gamma_1$ and $\Gamma_2$ by $n! \div \lambda_1!\lambda_2!$, since its characteristic is $\phi_1(s)\phi_2(s)$ and the coefficient of the highest power of $s_1$ in this product is $\frac{1}{\lambda_1!\lambda_2!}$ times the product of the dimensions of $\Gamma_1$ and $\Gamma_2$.




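3. The formula of Frobenius for the simple characteristics of the symmetric group and its modification by Schur. If we suppose the indeterminates $(s_1, \ldots, s_n)$ which occur in the expressions for the characteristics of the symmetric group to be the power sums of other indeterminates $(z_1, \ldots, z_n)$:

\[ s_k = z_1^k + z_2^k + \cdots + z_n^k;\quad (k = 1, \ldots, n) \]

the principal characteristic $q_m(s)$ of the symmetric group on $m < n$ letters becomes, when expressed in terms of the indeterminates $(z_1, \ldots, z_n)$, merely the complete homogeneous symmetric function $p_m(z)$ of degree $m$ in the $n$ variables $(z_1, \ldots, z_n)$. The first few of these functions are

\[ p_1(z) = \sum z_1;\quad p_2(z) = \sum z_1^2 + \sum z_1z_2;\quad p_3(z) = \sum z_1^3 + \sum z_1^2z_2 + \sum z_1z_2z_3 \]

and they are the coefficients of the development of the generating function $f(t) = \{(1 - z_1t)(1 - z_2t)\cdots(1 - z_nt)\}^{-1}$ in a power series in $t$:

\[ f(t) = p_0(z) + p_1(z)t + p_2(z)t^2 + \cdots. \]

But

\[ \log f(t) = s_1t + \frac{s_2t^2}{2} + \frac{s_3t^3}{3} + \cdots \]

so that

\[ f(t) = e^{s_1t}\, e^{s_2t^2/2}\, e^{s_3t^3/3} \cdots = \sum_{j=0}^{\infty} t^j \sum_{\alpha_1 + 2\alpha_2 + \cdots + j\alpha_j = j} \frac{s_1^{\alpha_1} s_2^{\alpha_2} \cdots s_j^{\alpha_j}}{1^{\alpha_1}\alpha_1!\, 2^{\alpha_2}\alpha_2! \cdots j^{\alpha_j}\alpha_j!} \]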
j=0 a,! a. ! a;! 1 9 j 


and this implies p;(z) q;(s) (j=0,1,2,---,m). The homogeneous 
products p;(%) = (2, zat), t, +t. +: +--+ are intimately 
connected, in a reciprocal manner, with the elementary symmetric functions: 


Oo(3) = 1; = 32,3 o2(8) on(B) == 


either set being expressible as polynomials with integral coefficients in the other 
set. In fact the generating function for the elementary symmetric functions 
is 


g(t) (1 + 


and on taking logarithms we find

$$\log g(t) = s_1 t - \tfrac{1}{2} s_2 t^2 + \tfrac{1}{3} s_3 t^3 - \cdots$$

so that $\sigma_j(z) = q_j(s_1, -s_2, s_3, -s_4, \cdots) = r_j(s)$. In other words the alternating characteristic of the symmetric group on $j \le n$ letters becomes, when expressed in terms of the indeterminates $(z_1, \cdots, z_n)$, simply the elementary symmetric function $\sigma_j(z)$. The two generating functions $f(t)$ and $g(t)$ are such that $g(t) = \{f(-t)\}^{-1}$ and hence $\{\sum \sigma_j t^j\}\{\sum (-1)^k p_k t^k\} = 1$ and this yields the series of relations

$$\sigma_0 p_0 = 1; \quad \sigma_0 p_1 - \sigma_1 p_0 = 0; \quad \sigma_0 p_2 - \sigma_1 p_1 + \sigma_2 p_0 = 0; \quad \cdots.$$
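The identities $p_j(z) = q_j(s)$ and the reciprocal relations between the $p$'s and the $\sigma$'s can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values of $z$ are illustrative assumptions, and $q_j$ is computed from the recurrence $j\,q_j = \sum_{k=1}^{j} s_k q_{j-k}$, which follows from differentiating $\log f(t)$.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import prod

def complete_homogeneous(z, j):
    """p_j(z): the sum of all monomials of degree j in the variables z."""
    return sum(prod(c) for c in combinations_with_replacement(z, j))

def elementary(z, j):
    """sigma_j(z): the sum of all products of j distinct variables."""
    return sum(prod(c) for c in combinations(z, j))

def q_from_power_sums(s, m):
    """q_0..q_m from power sums s[1..m], using j*q_j = sum_k s_k*q_(j-k)."""
    q = [Fraction(1)]
    for j in range(1, m + 1):
        q.append(sum(s[k] * q[j - k] for k in range(1, j + 1)) / j)
    return q

z = [2, 3, 5, 7]          # hypothetical sample values for (z_1, ..., z_n)
m = len(z)
s = [None] + [Fraction(sum(x**k for x in z)) for k in range(1, m + 1)]
q = q_from_power_sums(s, m)
p = [complete_homogeneous(z, j) for j in range(m + 1)]
sigma = [elementary(z, j) for j in range(m + 1)]

# Alternating sums sum_k (-1)^k sigma_(j-k) p_k for j = 1..m (all expected to vanish).
relations = [sum((-1)**k * sigma[j - k] * p[k] for k in range(j + 1))
             for j in range(1, m + 1)]
```

Exact rational arithmetic is used so that the comparison of `q` with `p` is not affected by rounding.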


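These may be expressed by the statement that the two matrices

$$P_j = \begin{pmatrix} p_0 & p_1 & \cdots & p_j \\ 0 & p_0 & \cdots & p_{j-1} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & p_0 \end{pmatrix}; \qquad \Sigma_j = \begin{pmatrix} \sigma_0 & -\sigma_1 & \cdots & (-1)^j \sigma_j \\ 0 & \sigma_0 & \cdots & (-1)^{j-1} \sigma_{j-1} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_0 \end{pmatrix}$$

are reciprocal (the elements below the diagonal in each matrix being zero).

Since $p_0 = 1 = \sigma_0$ the determinant of each matrix is unity and so each element of either is a cofactor of the other; thus

$$\sigma_1 = p_1; \qquad \sigma_2 = \begin{vmatrix} p_1 & p_2 \\ p_0 & p_1 \end{vmatrix}; \qquad \sigma_3 = \begin{vmatrix} p_1 & p_2 & p_3 \\ p_0 & p_1 & p_2 \\ 0 & p_0 & p_1 \end{vmatrix}$$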

and so in general. $\sigma_r$ is an $r$ rowed determinant whose diagonal elements are all $p_1$, the non-diagonal elements being obtained by increasing the suffix carried by $p$ methodically by one as we move from each column to its right-hand neighbor and decreasing this suffix by one as we move from each column to its left-hand neighbor (it being understood that $p_{-1} = p_{-2} = \cdots = 0$). The result of this calculation needed for our immediate purpose lies on the surface: the symmetric functions $p_j(z)$ may be used instead of the elementary symmetric functions $\sigma_j(z)$ as a basis for symmetric functions. More particularly any symmetric polynomial in $z$ with integral coefficients may be expressed as a linear combination of products of the functions $p_j$ with integral coefficients; and if the polynomial is homogeneous of degree $n$ the products entering the linear combination are of the type $p_{\lambda_1}(z)\,p_{\lambda_2}(z) \cdots$ where $\lambda_1 + \lambda_2 + \cdots = n$. Since $p_j(z) = q_j(s)$ this furnishes the basic result: any homogeneous symmetric polynomial of degree $n$ in the variables $(z_1, \cdots, z_n)$, with integral coefficients, is, when expressed in terms of the variables $(s_1, \cdots, s_n)$, a generalised characteristic of the symmetric group on $n$ letters. Frobenius' essential contribution to the theory was the discovery of those particular symmetric functions of $z$ which yield the simple characteristics; and to Schur we owe the recognition of the importance of a very elegant and useful expression of them (due to Jacobi) as determinants whose elements are the functions $p_j(z) = q_j(s)$.

The expression just given for $\sigma_r(z)$ as a determinant whose elements are members of the set $p_j(z)$ is merely a special case of a general reciprocal relationship between determinants whose elements are members of the set $\sigma_j(z) = r_j(s)$ and determinants whose elements are members of the set $p_j(z) = q_j(s)$; which merely reflects the fact that the matrices $P_j$ and $\Sigma_j$ are reciprocal. We shall need a special case of this relationship and it is convenient to derive it here. Let $(\lambda) = (\lambda_1, \cdots, \lambda_k)$ be a partition of $n$ and consider the determinant

$$\{\lambda\} = \begin{vmatrix} p_{\lambda_1} & p_{\lambda_1+1} & \cdots & p_{\lambda_1+k-1} \\ p_{\lambda_2-1} & p_{\lambda_2} & \cdots & p_{\lambda_2+k-2} \\ \vdots & & & \vdots \\ p_{\lambda_k-k+1} & \cdots & & p_{\lambda_k} \end{vmatrix}$$

This determinant is a certain $k$ rowed minor of the matrix $P_{\lambda_1+k-1}$; in fact the one obtained by erasing the first $\lambda_1$ columns and retaining the 1st, the $(\lambda_1 - \lambda_2 + 2)$-nd, the $(\lambda_1 - \lambda_3 + 3)$-rd, $\cdots$ and the $(\lambda_1 - \lambda_k + k)$-th rows. Save for a question of sign, into which it is profitless to go since it can be settled in a trivial manner later, $\{\lambda\}$ is, therefore, equal to that minor of the reciprocal matrix $\Sigma_{\lambda_1+k-1}$ which is obtained by keeping the first $\lambda_1$ rows and omitting the 1-st, the $(\lambda_1 - \lambda_2 + 2)$-nd, the $(\lambda_1 - \lambda_3 + 3)$-rd, $\cdots$ and the $(\lambda_1 - \lambda_k + k)$-th columns. Since $\lambda_k > 0$, the last column of $\Sigma_{\lambda_1+k-1}$ is kept and the suffix attached to the $\sigma$ in the lower right-hand corner is $k$ (for the minor has $\lambda_1$ rows and the suffix attached to the $\sigma$ in the upper right-hand corner is $\lambda_1 + k - 1$ whilst the suffixes diminish methodically by one as we step from each row to its neighbor below). Counting from the last column the first column omitted is the $(\lambda_k + 1)$-st; and the suffixes of the last $\lambda_k$ diagonal elements of the minor of $\Sigma_{\lambda_1+k-1}$ in question all equal $k$; since the second column omitted is the $(\lambda_{k-1} + 2)$-nd, counting from the last, and so on, the next diagonal suffix, counting upwards to the left, is less than $k$ by the number of $\lambda$'s that equal $\lambda_k$. Continuing in this way we see that the diagonal suffixes of the minor of $\Sigma_{\lambda_1+k-1}$ constitute the partition $(\mu) = (\mu_1, \cdots, \mu_{\lambda_1})$ of $n$ which is associated with the partition $(\lambda)$ of $n$. For instance if $n = 4$ and $(\lambda) = (2, 1^2)$ so that $(\mu) = (3, 1)$ we have proved that

$$\{2, 1^2\} = \begin{vmatrix} p_2 & p_3 & p_4 \\ p_0 & p_1 & p_2 \\ 0 & p_0 & p_1 \end{vmatrix} = \pm \begin{vmatrix} -\sigma_1 & \sigma_4 \\ \sigma_0 & -\sigma_3 \end{vmatrix}$$
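The instance $(\lambda) = (2, 1^2)$, $(\mu) = (3, 1)$ can be confirmed numerically. The sketch below is an illustration only (the helper names and the sample values of $z$ are assumptions); it evaluates the three rowed $p$ determinant and compares it with $\sigma_1\sigma_3 - \sigma_4$, the value of the two rowed $\sigma$ minor taken with the sign $+$.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def p(z, j):
    """Complete homogeneous symmetric function p_j(z); zero for negative j."""
    if j < 0:
        return 0
    return sum(prod(c) for c in combinations_with_replacement(z, j))

def sigma(z, j):
    """Elementary symmetric function sigma_j(z)."""
    return sum(prod(c) for c in combinations(z, j))

def det3(m):
    """Determinant of a 3 x 3 matrix by direct expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

z = [1, 2, 3, 4]  # hypothetical sample values

p_side = det3([[p(z, 2), p(z, 3), p(z, 4)],
               [p(z, 0), p(z, 1), p(z, 2)],
               [0,       p(z, 0), p(z, 1)]])

# Value of the 2 x 2 minor |-s1 s4; s0 -s3| with sigma_0 = 1.
sigma_side = sigma(z, 1) * sigma(z, 3) - sigma(z, 0) * sigma(z, 4)
```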


The negative signs may be removed from the $\sigma$'s carrying odd labels by changing the signs of all columns having a $\sigma$ with an odd suffix at the bottom and following this by a change of sign of all rows having a $\sigma$ with an even suffix at the end. On reflecting the $\sigma$ minor about its secondary diagonal (an operation which does not affect the value of the determinant) we find

$$\begin{vmatrix} p_{\lambda_1} & \cdot & \cdot \\ \cdot & \ddots & \cdot \\ \cdot & \cdot & p_{\lambda_k} \end{vmatrix} = \pm \begin{vmatrix} \sigma_{\mu_1} & \cdot & \cdot \\ \cdot & \ddots & \cdot \\ \cdot & \cdot & \sigma_{\mu_{\lambda_1}} \end{vmatrix}$$

where the non-diagonal elements of each determinant are filled in by increasing methodically the suffixes by one as we move to the right (and diminishing them by one as we move to the left); it being understood that a $p$ or $\sigma$ carrying a negative suffix is to be replaced by zero and that the partitions $(\lambda)$ and $(\mu)$ of $n$ are associated. That the undetermined sign is $+$, rather than $-$, is immediately evident when we recall that $p_j(z) = q_j(s)$,

$$\sigma_j(z) = r_j(s) = q_j(s_1, -s_2, s_3, -\cdots).$$

On setting $s_1 = 1$, $s_2 = s_3 = \cdots = s_n = 0$, $p_j(z) = q_j(s)$ takes the value $1/j!$ as also does $\sigma_j(z) = r_j(s)$. And on writing

$$\lambda_1 + (k-1) = l_1; \quad \lambda_2 + (k-2) = l_2; \quad \cdots; \quad \lambda_k = l_k$$

it is clear that the determinant $\{\lambda\}$ becomes the quotient by $l_1!\, l_2! \cdots l_k!$ of a $k$ rowed determinant of which the element in the $j$-th row and $p$-th column is $l_j(l_j - 1) \cdots (l_j + p + 1 - k)$, there being $k - p$ factors (the elements in the $k$-th column being all unity). Since the element in the $j$-th row and the $p$-th column is a polynomial in $l_j$ of degree $k - p$ (the coefficients of the polynomial being independent of the row number $j$ and the coefficient of the highest power being unity) it is at once clear, on subtracting from each column an appropriate linear combination of the succeeding columns, that the determinant is equivalent to the Vandermonde determinant whose $j$-th row is $(l_j^{k-1}, l_j^{k-2}, \cdots, l_j, 1)$. Its value is, therefore

$$\Delta(l) = \prod_{p<q} (l_p - l_q).$$

Since the partition $(\lambda)$ of $n$ was arranged in non-increasing order: $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k > 0$ the numbers $(l)$ are arranged in descending order: $l_1 > l_2 > \cdots > l_k > 0$. Hence when $s_1 = 1$, $s_2 = s_3 = \cdots = s_n = 0$ both the determinants

$$\begin{vmatrix} p_{\lambda_1} & \cdot & \cdot \\ \cdot & \ddots & \cdot \\ \cdot & \cdot & p_{\lambda_k} \end{vmatrix} \quad \text{and} \quad \begin{vmatrix} \sigma_{\mu_1} & \cdot & \cdot \\ \cdot & \ddots & \cdot \\ \cdot & \cdot & \sigma_{\mu_{\lambda_1}} \end{vmatrix}$$

are positive; they must accordingly be equal and not one the negative of the other. Moreover if we set $\mu_1 + (\lambda_1 - 1) = m_1$, $\cdots$, $\mu_{\lambda_1} = m_{\lambda_1}$ the two numbers $\Delta(l)/(l_1!\, l_2! \cdots l_k!)$ and $\Delta(m)/(m_1!\, m_2! \cdots m_{\lambda_1}!)$ are equal. Finally we remark that the theorem of the present paragraph may be stated in the following convenient manner. Denoting by $\{\lambda\}^*$ the result of changing the signs of $s_2, s_4, \cdots$ in $\{\lambda\}$ then

$$\{\lambda\}^* = \{\mu\}$$

where $\{\lambda\}$ is the determinant

$$\begin{vmatrix} q_{\lambda_1}(s) & \cdot & \cdot \\ \cdot & q_{\lambda_2}(s) & \cdot \\ \cdot & \cdot & \ddots \\ \cdot & \cdot & \cdot & q_{\lambda_k}(s) \end{vmatrix}$$

and $(\mu)$ is the partition of $n$ associated with $(\lambda)$.
We now proceed to the determination of the simple characteristics of the symmetric group on $n$ letters (remembering that any homogeneous symmetric polynomial of degree $n$ in the $n$ indeterminates $(z_1, \cdots, z_n)$ yields, when expressed in terms of their power sums $(s_1, \cdots, s_n)$, a generalised characteristic of this group). What is needed is a criterion which will decide whether or not a set of generalised characteristics are simple, and this is provided as follows. Let $\chi^p$ $(p = 1, \cdots, r)$ denote the characters of the $r$ non-equivalent irreducible representations of the symmetric group on $n$ letters ($r$ being the number of partitions of $n$) so that $\phi_p(s) = (s \cdot \chi^p)$ are the $r$ simple characteristics. Let $t$ be a second indeterminate and form the products $\phi_p(s) \cdot \phi_p(t)$ and then sum these products as $p$ runs over the values $1, 2, \cdots, r$. We have

$$\phi_p(s) = \frac{1}{n!} \sum_P \chi^p_{(\alpha)}\, s_1^{\alpha_1} \cdots s_n^{\alpha_n}$$

where the summation is over all elements $P$ of the symmetric group ($(\alpha) = (\alpha_1, \cdots, \alpha_n)$ indicating the class to which $P$ belongs). Similarly

$$\phi_p(t) = \frac{1}{n!} \sum_Q \chi^p_{(\beta)}\, t_1^{\beta_1} \cdots t_n^{\beta_n}$$

where $(\beta)$ indicates the class to which $Q$ belongs. On forming the product and summing with respect to $p$ we have a triple summation; namely with respect to $P$, with respect to $Q$, and with respect to $p$. Performing first the summation with respect to $p$ we obtain zero unless $Q$ belongs to the same class as $P$ (owing to the orthogonality relations amongst the simple characters). For a fixed $P$ there are $N_{(\alpha)}$ choices of $Q$, namely all the $Q$'s in the same class as $P$, and summation with respect to $Q$ gives $\sum_p N_{(\alpha)} \chi^p_{(\alpha)} \chi^p_{(\alpha)} = n!$. There remains only the single summation with respect to $P$ and we find

$$\sum_{p=1}^{r} \phi_p(s)\,\phi_p(t) = \frac{1}{n!} \sum_P u_1^{\alpha_1} \cdots u_n^{\alpha_n} = q_n(u) = q_n(st)$$

where $u = st$ in the sense that $u_1 = s_1 t_1$, $u_2 = s_2 t_2$, $\cdots$, $u_n = s_n t_n$. The real force of this result lies in the fact that its converse is true in the following sense: suppose we have $r$ generalised characteristics $F_p(s)$ $(p = 1, \cdots, r)$ possessing the property that $\sum_{p=1}^{r} F_p(s) F_p(t) = q_n(st)$; then each of these characteristics is either a simple characteristic or the negative of one. In fact $F_p(s) = \sum_q c_p^{\,q}\, \phi_q(s)$, where the coefficients $c_p^{\,q}$ are integers, positive, negative or zero and so

$$\sum_{p=1}^{r} \phi_p(s)\,\phi_p(t) = q_n(st) = \sum_{p=1}^{r} F_p(s) F_p(t) = \sum_{p, q, q'} c_p^{\,q}\, c_p^{\,q'}\, \phi_q(s)\, \phi_{q'}(t).$$

Now the $r$ simple characteristics are linearly independent; for a hypothecated relation $\sum_q c^q \phi_q(s) = 0$ would imply $\sum_q c^q \chi^q_{(\alpha)} = 0$ and this would imply $c^p = 0$ $(p = 1, \cdots, r)$, owing to the orthogonality relations (1). Equating then the coefficients of $\phi_q(s)$ on both sides of the equation just written, we obtain

$$\phi_q(t) = \sum_{p, q'} c_p^{\,q}\, c_p^{\,q'}\, \phi_{q'}(t)$$

and this implies, again on account of the linear independence of the simple characteristics,

$$\sum_{p=1}^{r} c_p^{\,q}\, c_p^{\,q'} = 0, \; q \ne q'; \qquad \sum_{p=1}^{r} (c_p^{\,q})^2 = 1.$$

Since the numbers $c_p^{\,q}$ are integers it follows from the second of these two equations that, for a fixed $q$, all $c_p^{\,q}$ vanish save one which is $\pm 1$; and then the remaining equations show that for a fixed $p$ all $c_p^{\,q}$ vanish save one which is $\pm 1$. In other words the generalised characteristics $F_p(s)$ are merely a rearrangement of the simple characteristics $\phi_p(s)$ followed, possibly, by a change of sign of some of them.

Let $(v_1, \cdots, v_n)$ be $n$ integers, positive or zero, no two of which are equal, supposed arranged in descending order of magnitude: $v_1 > v_2 > \cdots > v_n$, and denote by $\Delta(v_1, \cdots, v_n)$ the $n$ rowed determinant of which the elements in the $j$-th row are the $v_j$-th powers of the indeterminates $(z_1, \cdots, z_n)$ $(j = 1, \cdots, n)$. When $v_1 = n-1$, $v_2 = n-2$, $\cdots$, $v_n = 0$, $\Delta(v_1, \cdots, v_n)$ is the Vandermonde determinant whose value is the difference product $\Delta = \Delta(z) = \prod_{p<q} (z_p - z_q)$. It is clear that $\Delta(v_1, \cdots, v_n)$ contains $\Delta$ as a factor and since both $\Delta(v_1, \cdots, v_n)$ and $\Delta$ are alternating functions of $z$ the quotient is symmetric and it is at once seen to be a polynomial of degree $(v_1 + v_2 + \cdots + v_n) - (n-1 + n-2 + \cdots + 1 + 0)$ with integral coefficients. If, then, $(\lambda) = (\lambda_1, \cdots, \lambda_n)$ is a partition of $n$ and we write

$$v_1 = \lambda_1 + (n-1), \quad v_2 = \lambda_2 + (n-2), \quad \cdots, \quad v_n = \lambda_n$$

the quotient $\Delta(v_1, \cdots, v_n)/\Delta$ is a symmetric polynomial of degree $n$, with integral coefficients, in the indeterminates $(z_1, \cdots, z_n)$; it furnishes, therefore, when expressed in terms of the power sums $(s_1, \cdots, s_n)$, a generalised characteristic of the symmetric group on $n$ letters and the basic result of Frobenius is to the effect that the characteristics obtained in this way are simple. Let us denote the quotient $\Delta(v_1, \cdots, v_n)/\Delta$ by $\{\lambda\}$; then in order to derive the result of Frobenius we have first to show that $\sum_{(\lambda)} \{\lambda\}(s)\,\{\lambda\}(t) = q_n(st)$ and then that the coefficient of $s_1^n$ in $\{\lambda\}(s)$ is positive. We first remark that the relations

$$s_j = \sum_{p} z_p^{\,j}; \qquad t_j = \sum_{q} y_q^{\,j}$$

imply $s_j t_j = \sum_{p,q} (z_p y_q)^j$ $(p, q = 1, \cdots, n)$. Hence if we denote the $n^2$ quantities $z_p y_q$ by $zy$ the relations $s \to z$, $t \to y$ imply $st \to zy$ and, in particular, $q_n(st) = p_n(zy)$; so that the first part of our problem may be re-phrased as follows: we must show that $\sum \{\lambda\}(z)\,\{\lambda\}(y) = p_n(zy)$, the summation being over the $r$ partitions of $n$.

To do this we first consider a determinant of order $n$ of which the element in the $i$-th row and $j$-th column is $(a_i + b_j)^{-1}$. On subtracting the first column from each of the others and removing the common factors

$$(b_1 - b_2)(b_1 - b_3) \cdots (b_1 - b_n) \div (b_1 + a_1)(b_1 + a_2) \cdots (b_1 + a_n)$$

we obtain a determinant of which the elements in the $i$-th row are $1, (a_i + b_2)^{-1}, \cdots, (a_i + b_n)^{-1}$. On subtracting the first row of this determinant from each of the others and removing the common factors

$$(a_1 - a_2)(a_1 - a_3) \cdots (a_1 - a_n) \div (a_1 + b_2)(a_1 + b_3) \cdots (a_1 + b_n)$$

we obtain a determinant of order $n - 1$ of which the element in the $i$-th row and $j$-th column is again $(a_i + b_j)^{-1}$ where now $i, j$ run from 2 to $n$ instead of from 1 to $n$ as before. It follows at once that the $n$-th order determinant, of which the element in the $i$-th row and $j$-th column is $(a_i + b_j)^{-1}$, has the value $\Delta(a)\Delta(b) \div \prod (a_i + b_j)$ $(i = 1, \cdots, n;\ j = 1, \cdots, n)$ where $\Delta(a)$ denotes the difference product $(a_1 - a_2) \cdots (a_{n-1} - a_n)$ (a result due to Cauchy). On writing $a_i = \alpha_i^{-1}$, $b_j = -\beta_j$ this result of Cauchy appears in the following equivalent form: the determinant of order $n$ of which the element in the $i$-th row and $j$-th column is $(1 - \alpha_i \beta_j)^{-1}$ has the value

$$\Delta(\alpha)\Delta(\beta) \div \prod (1 - \alpha_i \beta_j).$$
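Both forms of Cauchy's result are easy to test with exact arithmetic. The following sketch is illustrative only: the function names and the sample values are assumptions, and the determinant is expanded over permutations, which is adequate only for small $n$.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def det(m):
    """Determinant by expansion over permutations (small matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        total += (-1) ** inversions * prod(m[i][perm[i]] for i in range(n))
    return total

def difference_product(x):
    """Delta(x) = product over p < q of (x_p - x_q)."""
    n = len(x)
    return prod(x[p] - x[q] for p in range(n) for q in range(p + 1, n))

# First form: elements (a_i + b_j)^(-1); sample values are hypothetical.
a, b = [1, 2, 4], [3, 5, 6]
lhs1 = det([[Fraction(1, ai + bj) for bj in b] for ai in a])
rhs1 = Fraction(difference_product(a) * difference_product(b),
                prod(ai + bj for ai in a for bj in b))

# Second form: elements (1 - alpha_i beta_j)^(-1).
alpha = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]
beta = [Fraction(1, 7), Fraction(2, 3), Fraction(1, 4)]
lhs2 = det([[1 / (1 - x * y) for y in beta] for x in alpha])
rhs2 = (difference_product(alpha) * difference_product(beta)
        / prod(1 - x * y for x in alpha for y in beta))
```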


But if $A$ denotes the $n \times \infty$ matrix of which the elements in the $i$-th row are $(1, \alpha_i, \alpha_i^2, \cdots)$ and $B$ the $\infty \times n$ matrix of which the elements in the $j$-th column are $(1, \beta_j, \beta_j^2, \cdots)$ the product $A \cdot B$ is an $n \times n$ matrix of which the element in the $i$-th row and $j$-th column is $1 + \alpha_i \beta_j + \alpha_i^2 \beta_j^2 + \cdots$ or $(1 - \alpha_i \beta_j)^{-1}$. The determinant of the product $AB$ may be found by selecting any $n$-th order matrix from $A$, multiplying its determinant by the determinant of the corresponding matrix from $B$, and adding all products so obtained; that the number of products is infinite need cause no concern since $\alpha$ and $\beta$ are indeterminates and we may regard them so chosen that the components $\alpha_i$, $\beta_j$ are all $< 1$ in numerical magnitude so that the infinite series which appear are all absolutely convergent. All determinants of order $n$ selected from the matrices $A$ and $B$ are of the type $\Delta(\rho_1, \cdots, \rho_n)$ where we may, without lack of generality, agree that $\rho_1 > \rho_2 > \cdots > \rho_n \ge 0$. Hence we have

$$\sum_{(\rho)} \Delta(\rho_1, \cdots, \rho_n)(\alpha) \cdot \Delta(\rho_1, \cdots, \rho_n)(\beta) = \Delta(\alpha)\Delta(\beta) \div \prod (1 - \alpha_i \beta_j).$$

On replacing $\beta$ by $\beta t$, i.e. $\beta_1$ by $\beta_1 t$, $\beta_2$ by $\beta_2 t$, $\cdots$, $\beta_n$ by $\beta_n t$, where $t$ is an indeterminate, $\Delta(\rho_1, \cdots, \rho_n)(\beta)$ is multiplied by $t^{\rho_1 + \cdots + \rho_n}$ and on writing

$$\rho_1 - (n-1) = \lambda_1, \quad \rho_2 - (n-2) = \lambda_2, \quad \cdots, \quad \rho_n = \lambda_n$$

we find

$$\sum \{\lambda\}(\alpha)\,\{\lambda\}(\beta)\, t^{\lambda_1 + \cdots + \lambda_n} = \frac{1}{\prod (1 - \alpha_i \beta_j t)} = \sum_{m=0}^{\infty} p_m(\alpha\beta)\, t^m.$$

On equating coefficients of $t^n$ we obtain

$$\sum \{\lambda\}(\alpha)\,\{\lambda\}(\beta) = p_n(\alpha\beta)$$

where the summation is over all partitions $(\lambda)$ of $n$. This proves that the symmetric polynomials $\{\lambda\}(z)$ furnish, when expressed in terms of the power sums $(s_1, \cdots, s_n)$, either simple characteristics or the negatives of these; all simple characteristics being obtained in this way. To show that we have actually the simple characteristics, and not the negatives of any of them, we must show that the coefficient of $s_1^n$ in $\{\lambda\}(z)$ is $> 0$. To do this we shall first derive Jacobi's expression for $\{\lambda\}(z)$ as a determinant whose elements are members of the set $p_j(z) = q_j(s)$, $j \le n$. Before doing this we remark that Frobenius stated his result in a slightly different form. From

$$\{\lambda\}(z) = \Delta(v_1, \cdots, v_n) \div \Delta(z)$$

and (3) we have

$$\Delta(z)\, s_1^{\alpha_1} \cdots s_n^{\alpha_n} = \sum_{(\lambda)} \chi^{(\lambda)}_{(\alpha)}\, \Delta(v_1, \cdots, v_n)$$

so that $\chi^{(\lambda)}_{(\alpha)}$ is the coefficient of $\Delta(v_1, v_2, \cdots, v_n)$ in the development of

$$\Delta(z)\, s^{(\alpha)} = \prod_{p<q} (z_p - z_q)\; s_1^{\alpha_1} \cdots s_n^{\alpha_n}.$$


To obtain Jacobi’s expression it is necessary to point out some trivially evident 
properties of the homogeneous products p;(<,,- - +52). The generating function for 


these products was {(1—2,t) 

0 


It follows, on multiplication by (1—z,t), that 


x x 
0 


(215° + or, equivalently, 


1’ 


so that p;(2,,- + +> 


formula ” to both terms on the right-hand side we obtain 


a relation which may be written in the form 


which suggests the relation 


n 


That this relation actually does hold is readily proved by induction; for assuming its 
validity for a stated value of m its validity for m + 1 follows at once. Thus 


P; —= Px (z,,) j_-m-1 + Pj_m (2,, ) } 
Since the relation (8) is true for m= 1 it is true for every positive integer; it being 
always understood that a p with a negative label is assigned the value zero. It is also 
understood that all Daley" - +,2,) are assigned the value zero when s < 1. 
W e need one other property of the homogeneous products (25+ + Writing 


in the equivalent form 


we find 


and, on interchanging z, and ®ns and subtracting, 




We are now ready to carry out Jacobi's transformation of the symmetric polynomial $\Delta(v_1, \cdots, v_n) \div \Delta(z)$. The determinant $\Delta(v_1, \cdots, v_n)$ has $z_j^{v_i} = p_{v_i}(z_j)$ as the element in the $i$-th row and $j$-th column. On subtracting the last column from each of the others and removing the factors

$$(z_1 - z_n)(z_2 - z_n) \cdots (z_{n-1} - z_n)$$

we obtain an $n$-th order determinant of which the element in the $i$-th row and $j$-th column is $p_{v_i - 1}(z_j, z_n)$ $(j = 1, \cdots, n-1)$; the element in the $i$-th row and $n$-th column remaining $p_{v_i}(z_n)$. We now subtract the $(n-1)$-st column from each of the columns which precede it and remove the factors

$$(z_1 - z_{n-1})(z_2 - z_{n-1}) \cdots (z_{n-2} - z_{n-1})$$

obtaining a determinant of which the element in the $i$-th row and $j$-th column is $p_{v_i - 2}(z_j, z_{n-1}, z_n)$ $(j = 1, \cdots, n-2)$, the elements in the last two columns remaining $p_{v_i - 1}(z_{n-1}, z_n)$ and $p_{v_i}(z_n)$. Proceeding in this way we see that $\{\lambda\}(z) = \Delta(v_1, \cdots, v_n) \div \Delta(z)$ may be expressed as a determinant of order $n$ of which the element in the $i$-th row and $j$-th column is $p_{v_i - (n-j)}(z_j, z_{j+1}, \cdots, z_n)$ $(j = 1, 2, \cdots, n)$, (it being always understood that the $p$'s with negative labels vanish). Upon multiplying this determinant by unity in the form of an $n$-th order determinant of which the element in the $i$-th row and $j$-th column is $p_{j-i}(z_1, \cdots, z_i)$ (so that the diagonal elements are all unity whilst the elements below the diagonal vanish) and using (8) (the multiplication is done row into column as in matrix multiplication) we find that

$$\Delta(v_1, \cdots, v_n) \div \Delta(z)$$

may be expressed as an $n$-th order determinant of which the element in the $i$-th row and $j$-th column is $p_{v_i - (n-j)}(z) = q_{v_i - (n-j)}(s)$. On setting

$$v_i = \lambda_i + (n - i) \qquad (i = 1, \cdots, n)$$

we see that $\{\lambda\}(z)$ is expressible as an $n$-th order determinant whose diagonal elements are $q_{\lambda_i}(s)$, the other elements in any row being obtained by methodically increasing (decreasing) the suffix carried by $q(s)$ as we move from any column to its neighbor on the right (left). If $k$ is such that $\lambda_k > 0$ whilst $\lambda_{k+1} = \cdots = \lambda_n = 0$ the last $n - k$ rows of our determinant have unity in the diagonal and zeros preceding the diagonal. Hence, and this is the essential simplification, $\{\lambda\}(z)$ may be expressed as a determinant of order $k$ of the type described above. The coefficient of $s_1^n$ is obtained by setting $s_1 = 1$, $s_2 = \cdots = s_n = 0$ and turns out to be positive (the calculation having been performed already on p. 453).
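The outcome of Jacobi's transformation, namely that the quotient $\Delta(v_1, \cdots, v_n) \div \Delta(z)$ equals the $k$ rowed determinant of complete homogeneous functions, can be spot-checked at integer values of $z$. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the values of $z$ must be distinct, and the determinant is expanded over permutations, so only small cases are practical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, permutations
from math import prod

def det(m):
    """Determinant by expansion over permutations (small matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        total += (-1) ** inversions * prod(m[i][perm[i]] for i in range(n))
    return total

def p(z, j):
    """Complete homogeneous symmetric function p_j(z); zero for negative j."""
    if j < 0:
        return 0
    return sum(prod(c) for c in combinations_with_replacement(z, j))

def quotient_of_alternants(lam, z):
    """Delta(v_1..v_n) / Delta(z) with v_i = lambda_i + (n - i)."""
    n = len(z)
    lam = list(lam) + [0] * (n - len(lam))
    v = [lam[i] + (n - 1 - i) for i in range(n)]
    numerator = det([[x ** vi for x in z] for vi in v])
    vandermonde = det([[x ** (n - 1 - i) for x in z] for i in range(n)])
    return Fraction(numerator, vandermonde)

def jacobi_determinant(lam, z):
    """k rowed determinant with p_(lambda_i - i + j)(z) in row i, column j."""
    k = len(lam)
    return det([[p(z, lam[i] - i + j) for j in range(k)] for i in range(k)])
```

For example, `quotient_of_alternants((2, 1, 1), [1, 2, 3])` and `jacobi_determinant((2, 1, 1), [1, 2, 3])` should agree.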

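We restate the main theorem (Frobenius-Schur) of the present section as follows: Attached to each partition $(\lambda)$ of $n$: $\lambda_1 + \lambda_2 + \cdots + \lambda_k = n$, $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k > 0$, is an irreducible representation $D(\lambda)$ of the symmetric group on $n$ letters. Its characteristic is the determinant of order $k$

$$(10) \qquad \{\lambda\}(z) = \phi_{(\lambda)}(s) = \begin{vmatrix} q_{\lambda_1}(s) & \cdot & \cdot & \cdot \\ \cdot & q_{\lambda_2}(s) & \cdot & \cdot \\ \cdot & \cdot & \ddots & \cdot \\ \cdot & \cdot & \cdot & q_{\lambda_k}(s) \end{vmatrix}$$

(where the remaining elements of any row are obtained from the diagonal element by methodically increasing (decreasing) by unity the suffix carried by $q(s)$ as we move from any column to its neighbor on the right (left)). The characteristic of the irreducible representation $D(\mu)$ which is attached to the associated partition $(\mu)$ of $n$ is

$$\phi_{(\mu)}(s) = \begin{vmatrix} q_{\mu_1}(s) & \cdot & \cdot \\ \cdot & \ddots & \cdot \\ \cdot & \cdot & q_{\mu_{\lambda_1}}(s) \end{vmatrix} = \phi^*_{(\lambda)}(s)$$

so that $\phi_{(\mu)}(s) = \phi_{(\lambda)}(s_1, -s_2, s_3, -s_4, \cdots)$. In other words, the characters of $D(\mu)$ which correspond to even classes are the same as the characters of $D(\lambda)$ whilst those which correspond to odd classes are the negatives of the characters of $D(\lambda)$; so that in constructing the character tables it is unnecessary to give the characters of $D(\mu)$ if those of $D(\lambda)$ have been already given. The common dimension of the irreducible representations $D(\lambda)$, $D(\mu)$ is

$$(11) \qquad n!\,\Delta(l) \div l_1!\, l_2! \cdots l_k!$$

where

$$l_1 = \lambda_1 + (k-1), \quad l_2 = \lambda_2 + (k-2), \quad \cdots, \quad l_k = \lambda_k \quad \text{and} \quad \Delta(l) = \prod_{p<q} (l_p - l_q).$$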
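The dimension formula (11) translates directly into a few lines of code. The following is a minimal sketch (the function name `dimension` is an assumption, not the paper's notation); the partition is given as a tuple of positive parts in non-increasing order.

```python
from math import factorial, prod

def dimension(lam):
    """Dimension of D(lambda) from formula (11): n! * Delta(l) / (l_1! ... l_k!)."""
    k = len(lam)
    n = sum(lam)
    # l_j = lambda_j + (k - j) for j = 1..k (0-based index below).
    l = [lam[j] + (k - 1 - j) for j in range(k)]
    delta = prod(l[p] - l[q] for p in range(k) for q in range(p + 1, k))
    # The quotient is always an integer, so integer division is exact.
    return factorial(n) * delta // prod(factorial(x) for x in l)
```

For instance `dimension((9, 2, 1))` uses $(l_1, l_2, l_3) = (11, 3, 1)$, and associated partitions such as `(3, 1)` and `(2, 1, 1)` give the same value.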


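The construction of the character tables for the various symmetric groups. From the expression (10) for $\phi_{(\lambda)}(s)$ and the relations (5) it follows at once, on applying the rule for differentiating a determinant, that $p\,\partial\phi_{(\lambda)}(s)/\partial s_p$ is the sum of $k$ determinants of which the $j$-th differs from $\phi_{(\lambda)}(s)$ in that the suffixes of the $q$'s in the $j$-th row are all diminished by $p$; $(p = 1, \cdots, n)$. The suffixes of the diagonal elements of this $j$-th determinant, namely $(\lambda_1, \lambda_2, \cdots, \lambda_j - p, \cdots, \lambda_k)$ add up to $n - p$ but they will not, in general, constitute a partition of $n - p$ for $\lambda_j - p$ may well be negative and even if it is not the normal non-increasing order may well be destroyed. However, an interchange of two adjacent rows of our determinant, which amounts only to a change of its sign, changes two adjacent diagonal suffixes by interchanging them and at the same time decreasing the one which was originally on the right by unity and increasing the one which was originally on the left by unity. By doing this sufficiently often the sequence $(\lambda_1, \lambda_2, \cdots, \lambda_j - p, \cdots, \lambda_k)$ may be put in non-ascending order. If it then ends in a negative integer we discard the corresponding determinant, whose last row consists entirely of zeros; if it ends in one or more zeros we ignore these as the corresponding determinant has units in the diagonal places in the last one or more rows, all preceding elements in these rows being zero. We shall understand by $\{\lambda_1, \cdots, \lambda_j - p, \cdots, \lambda_k\}$ the simple characteristic of the symmetric group on $n - p$ letters $(p = 1, 2, \cdots, n-1)$ corresponding to the partition of $n - p$ obtained in this way provided the number of necessary interchanges is even and the negative of this simple characteristic if the number of interchanges is odd. Since $\{\cdots a, b \cdots\} = -\{\cdots b-1, a+1 \cdots\}$ it is clear that $\{\cdots a, b \cdots\} = 0$ if $b = a + 1$; similarly $\{\cdots a, b, c \cdots\} = 0$ if $c = a + 2$, and so on. With this understanding of the symbol $\{\lambda_1, \cdots, \lambda_j - p, \cdots, \lambda_k\}$ we have, then,

$$(12) \qquad p\,\frac{\partial \phi_{(\lambda)}(s)}{\partial s_p} = \sum_{j=1}^{k} \{\lambda_1, \cdots, \lambda_j - p, \cdots, \lambda_k\}.$$

On writing out $\phi_{(\lambda)}(s)$ thus:

$$\phi_{(\lambda)}(s) = \sum_{(\alpha)} \chi^{(\lambda)}_{(\alpha)}\, \frac{1}{\alpha_1!\,\alpha_2! \cdots \alpha_n!} \left(\frac{s_1}{1}\right)^{\alpha_1} \left(\frac{s_2}{2}\right)^{\alpha_2} \cdots \left(\frac{s_n}{n}\right)^{\alpha_n}$$

we have

$$p\,\frac{\partial \phi_{(\lambda)}(s)}{\partial s_p} = \sum_{(\alpha')} \chi^{(\lambda)}_{(\alpha)}\, \frac{1}{\alpha'_1! \cdots \alpha'_n!} \left(\frac{s_1}{1}\right)^{\alpha'_1} \cdots \left(\frac{s_p}{p}\right)^{\alpha'_p} \cdots \left(\frac{s_n}{n}\right)^{\alpha'_n}$$

where $\alpha'_p = \alpha_p - 1$ and $(\alpha') = (\alpha_1, \cdots, \alpha_p - 1, \cdots, \alpha_n)$ is that class of the symmetric group on $n - p$ letters which contains one less cycle on $p$ letters than the class $(\alpha)$ of the symmetric group on $n$ letters. On equating coefficients of $s_1^{\alpha'_1} \cdots s_n^{\alpha'_n}$ on both sides of the equation (12) we find

$$\chi^{(\lambda)}_{(\alpha)} = \sum_{j=1}^{k} \chi^{(\lambda_1, \cdots, \lambda_j - p, \cdots, \lambda_k)}_{(\alpha')}$$

a relation which we find convenient to write in the form

$$(13) \qquad \{\lambda\}_\alpha = \sum_{j=1}^{k} \{\lambda_1, \cdots, \lambda_j - p, \cdots, \lambda_k\}_{\alpha'}.$$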


This basic formula enables us to write down at once those characters of the symmetric group on $n$ letters which correspond to a class containing at least one cycle on $p$ letters when the characters of that class of the symmetric group on $n - p$ letters which contains one less cycle on $p$ letters are known; $(p = 1, \cdots, n-1)$. The same formula yields directly the characters of the class containing but one cycle on $n$ letters; since $\lambda_2 \ge 1$, $\lambda_3 \ge 1$, $\cdots$, $\lambda_k \ge 1$ we have $\lambda_1 + (k-1) \le n$, the equality holding only when $\lambda_2 = \cdots = \lambda_k = 1$, and so $\lambda_1 - n + (k-1) \le 0$ and this implies $\{\lambda_1 - n, \lambda_2, \cdots, \lambda_k\} = 0$ unless $\lambda_2 = \cdots = \lambda_k = 1$ since then the last term, when it is rearranged in non-increasing order, namely $\lambda_1 - n + k - 1$, is $< 0$. The other terms $\{\lambda_1, \lambda_2 - n, \cdots, \lambda_k\}$ etc., are zero for all partitions $(\lambda)$ since $\lambda_2 - n + (k-2) < \lambda_1 - n + k - 1 \le 0$ and so on. Hence the characters of the class containing but one cycle on $n$ letters are zero unless the partition $(\lambda)$ is of the type $(n-k+1, 1^{k-1})$. On subtracting $n$ from the first number $n - k + 1$ of this partition of $n$ we obtain $\{1-k, 1^{k-1}\}$ and $k - 1$ rearrangements are necessary to write this as $\{0^k\}$ which $= 1$. Since

$$n\,\frac{\partial \phi_{(\lambda)}(s)}{\partial s_n}$$

is the character of $D(\lambda)$ for this class, we have

$$(14) \qquad \chi^{(n-k+1,\, 1^{k-1})} = (-1)^{k-1}; \quad \text{all other } \chi = 0.$$
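The recurrence (13), together with the rule for restoring non-increasing order, lends itself to a short recursive computation. The sketch below is one possible reading, not the paper's own procedure: the names `straighten` and `character` are hypothetical, and instead of interchanging adjacent suffixes one pair at a time it sorts the numbers $l_j = \lambda_j + (k - j)$, each interchange of two adjacent $l$'s corresponding to one interchange of adjacent rows.

```python
from functools import lru_cache

def straighten(seq):
    """Reduce the symbol {seq} to (sign, partition); sign 0 means it vanishes."""
    k = len(seq)
    l = [seq[j] + (k - 1 - j) for j in range(k)]
    # A negative l gives a row of zeros; two equal l's give two equal rows.
    if min(l) < 0 or len(set(l)) < k:
        return 0, ()
    sign = 1
    for i in range(k):                      # bubble sort into descending order
        for j in range(k - 1 - i):
            if l[j] < l[j + 1]:
                l[j], l[j + 1] = l[j + 1], l[j]
                sign = -sign                # each adjacent interchange flips the sign
    parts = [l[j] - (k - 1 - j) for j in range(k)]
    return sign, tuple(x for x in parts if x > 0)   # trailing zeros are ignored

@lru_cache(maxsize=None)
def character(lam, cycles):
    """Character of D(lam) on the class with the given tuple of cycle lengths."""
    if not cycles:
        return 1 if not lam else 0
    p, rest = cycles[0], cycles[1:]
    total = 0
    for j in range(len(lam)):               # subtract p from each part in turn
        seq = list(lam)
        seq[j] -= p
        sign, part = straighten(seq)
        if sign:
            total += sign * character(part, rest)
    return total
```

A call such as `character((3, 1, 1), (5,))` treats the class of one cycle on 5 letters; the arguments are assumed to satisfy `sum(lam) == sum(cycles)`.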

This formula has the definite advantage, over the recurrence formula (13), that it tells us explicitly, without referring to data concerning the symmetric group on a lesser number of letters, the characters attached to a particular class of the symmetric group on $n$ letters, namely the class containing but one cycle on $n$ letters. The formula (11) of Frobenius giving the dimension of $D(\lambda)$, or, equivalently, the character attached to the unit class, has a similar advantage. We may combine our recurrence formula with the dimension formula of Frobenius to determine directly characters of classes containing one or more unary cycles. E.g., suppose we wish to calculate the characters of the symmetric group on $n = 20$ letters corresponding to the class containing $\alpha_1 = 12$ unary cycles and $\alpha_8 = 1$ cycle on 8 letters. We shall illustrate by considering the representation $D(9, 6, 3, 2)$. Applying our recurrence formula with $p = 8$ we obtain

$$\{9, 6, 3, 2\}_\alpha = \{1, 6, 3, 2\}_{\alpha'} + \{9, -2, 3, 2\}_{\alpha'} + \{9, 6, -5, 2\}_{\alpha'} + \{9, 6, 3, -6\}_{\alpha'};$$

of the four terms on the right the first, third and fourth vanish; the first because $3 = 1 + 2$; the third because $-5 + 1 < 0$ and the fourth because $-6 < 0$. There remains $\{9, -2, 3, 2\}_{\alpha'} = -\{9, 2, -1, 2\}_{\alpha'} = \{9, 2, 1\}_{\alpha'}$ and since $\alpha'$ is the unit class the dimension formula of Frobenius yields, since $(l_1, l_2, l_3) = (11, 3, 1)$,

$$\frac{12!}{11!\,3!\,1!}\,(8)(10)(2) = 320.$$

Similar, although not quite such convenient, formulae may be found for the characters of a class containing only cycles of the same length. E.g., let $n = 2m$ and consider the class containing $m$ binary cycles. The characters of this class are found by setting $s_2 = 1$, $s_1 = s_3 = \cdots = 0$ in the expressions for the simple characteristics; it being clear that then $q_j = 0$ if $j$ is odd whilst $q_{2p} = 1/(2^p \cdot p!)$. Thus, for $n = 12$, the character of the class $\alpha_2 = 6$ of $D(5, 4, 2, 1)$ is

$$2^6 \cdot 6! \begin{vmatrix} 0 & (2^3 \cdot 3!)^{-1} & 0 & (2^4 \cdot 4!)^{-1} \\ 0 & (2^2 \cdot 2!)^{-1} & 0 & (2^3 \cdot 3!)^{-1} \\ 1 & 0 & 2^{-1} & 0 \\ 0 & 0 & 1 & 0 \end{vmatrix} = 6!\,\{(4!\,2!)^{-1} - (3!\,3!)^{-1}\} = -5.$$
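The same specialisation works for a class consisting of $c$ cycles all of length $m$: one sets $s_m = 1$ and every other $s$ equal to zero, so that $q_j$ vanishes unless $j$ is a multiple of $m$ and $q_{mp} = 1/(m^p \cdot p!)$, and multiplies the determinant by $m^c \cdot c!$. The sketch below is an illustration with hypothetical function names; the determinant is expanded over permutations, so it is only suitable for partitions with few parts.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod

def det(m):
    """Determinant by expansion over permutations (small matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        total += (-1) ** inversions * prod(m[i][perm[i]] for i in range(n))
    return total

def character_equal_cycles(lam, m):
    """Character of D(lam) on the class whose cycles all have length m."""
    n = sum(lam)
    c, remainder = divmod(n, m)
    if remainder:
        raise ValueError("n must be a multiple of the cycle length m")

    def q(j):
        # q_j at s_m = 1, all other s = 0.
        if j < 0 or j % m:
            return Fraction(0)
        return Fraction(1, m ** (j // m) * factorial(j // m))

    k = len(lam)
    d = det([[q(lam[i] - i + j) for j in range(k)] for i in range(k)])
    return d * m ** c * factorial(c)
```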


Similarly the character of the class a, = 4 of D(6, 3") is 


@. 
34,4! | 0 320 | 


whilst the character of the class a, = 3 of D(%, 27,1) is 


| 0 (4%.2!)7 0 0 


| 


Two more examples will suffice; suppose for $n = 15$ we wish the character of the class $\alpha_5 = 3$ for the representation $D(5, 4, 3, 2, 1)$. On setting $s_5 = 1$, $s_1 = s_2 = \cdots = 0$ all the $q_j$ vanish save those for which $j$ is a multiple of 5 and $q_{5p}$ takes the value $(5^p \cdot p!)^{-1}$. Then the desired character =


) 
5 
| 
43 3! | == —— 3. 
| 0 0 0 
| 0 10 | 




5*7 000 0 
0 0100 
5°.3!/;0 0001] =—150. 
0 1000 
0 0010 


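If we wish, for a final example, to obtain for $n = 12$ the character of the class $\alpha_1 = 1$, $\alpha_2 = 1$, $\alpha_3 = 3$ for the representation $D(7, 1^5)$ we may proceed as follows. On applying our recurrence formula twice, first with $p = 1$ and then with $p = 2$, we find

$$\{7, 1^5\}_\alpha = \{6, 1^5\}_{\alpha'} + \{7, 1^4\}_{\alpha'} = \{4, 1^5\}_{\alpha''} - \{6, 1^3\}_{\alpha''} + \{5, 1^4\}_{\alpha''} - \{7, 1^2\}_{\alpha''}$$

where $\alpha''$ is the class, of the symmetric group on 9 letters, consisting of permutations each of which has three ternary cycles. Since this class is positive $\{4, 1^5\}_{\alpha''} = \{6, 1^3\}_{\alpha''}$ and we have merely to calculate $\{5, 1^4\}_{\alpha''}$ and $\{7, 1^2\}_{\alpha''}$. We find

$$\{5, 1^4\}_{\alpha''} = 3^3 \cdot 3! \begin{vmatrix} 0 & (3^2 \cdot 2!)^{-1} & 0 & 0 & (3^3 \cdot 3!)^{-1} \\ 1 & 0 & 0 & 3^{-1} & 0 \\ 0 & 1 & 0 & 0 & 3^{-1} \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{vmatrix} = -2$$

$$\{7, 1^2\}_{\alpha''} = 3^3 \cdot 3! \begin{vmatrix} 0 & 0 & (3^3 \cdot 3!)^{-1} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = 1$$

so that the desired character is $-3$. The most trivial example of this method furnishes directly the characters of the class $\alpha_1 = 1$, $\alpha_{n-1} = 1$ of the symmetric group on $n$ letters. All characters are zero save that of the identical representation and those associated with the partitions

$$\lambda_1 = n - k, \quad \lambda_2 = 2, \quad \lambda_3 = \cdots = \lambda_k = 1$$

and these have the value $(-1)^{k-1}$.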

The character tables for the various symmetric groups from $n = 2$ to $n = 10$, inclusive, are given in the paper numbered (15) in the list of references. The character table for $n = 11$ is given in the paper numbered (16); (anyone using this table should note that the characters of $(5, 3, 1^3)$ are given with the wrong signs for the odd classes). And the character tables for $n = 12$ and $n = 13$ are given in the paper numbered (17).

We may derive by the method just described explicit formulae for the characters of those classes of the symmetric group on $n$ letters which consist of $\alpha_1 = n - p$ unary cycles and $\alpha_p = 1$ cycle on $p$ letters; $(p = 2, 3, 4, \cdots)$. These formulae were given by Frobenius (4) for $p = 2$ (the transposition class) and (5) for $p = 3, 4$. Since they are of importance in the physical applications we give their derivation here. We first remark that a partition $(\lambda)$ of $n$ may be conveniently specified as follows: draw the principal diagonal of the diagram of the partition (i.e. the diagonal starting at the upper left-hand corner) and suppose it strikes $s$ columns. Denote by $b_1 > b_2 > \cdots > b_s \ge 0$ the number of dots to the right of the diagonal in the rows $1, 2, \cdots, s$, respectively, and by $a_1 > a_2 > \cdots > a_s \ge 0$ the number of dots below the diagonal in the columns $1, \cdots, s$ respectively. Then the partition is described by $b = (b_1, \cdots, b_s)$ and $a = (a_1, \cdots, a_s)$ it being clear that the partition is self-associated when and only when $a = b$. The number of dots in the first row and column together $= b_1 + a_1 + 1$; when these are deleted the number of dots in the new first row and column $= b_2 + a_2 + 1$. Proceeding in this way we have $n = \sum_{j=1}^{s} (b_j + a_j + 1)$. It is clear from the definition that $b_j = \lambda_j - j$ $(j = 1, \cdots, s)$, whilst the differences $p - \lambda_p - 1$ $(p = s+1, \cdots, k)$ satisfy the inequalities

$$0 \le (s+1) - \lambda_{s+1} - 1 < \cdots < k - \lambda_k - 1 \le k - 1.$$

Hence they are the complementary set to the set $a_1 > a_2 > \cdots > a_s$ in the set $0, 1, \cdots, k-1$. In fact $\lambda_k > 0$ shows that $a_1 = k - 1$ is not in the set; if $\lambda_k > 1$, $a_2 = k - 2$ is not in the set and so on. The following will serve as illustrations of the definitions of $b$ and $a$:

$(\lambda) = (3, 2^2, 1^2)$; $s = 2$; $b = (2, 0)$; $a = (4, 1)$
$(\lambda) = (4, 2, 1^2)$; $s = 2$; $b = (3, 0)$; $a = (3, 0)$
$(\lambda) = (4^3, 1)$; $s = 3$; $b = (3, 2, 1)$; $a = (3, 1, 0)$.
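The three illustrations can be reproduced mechanically from the definitions of $s$, $b$ and $a$. The following is a minimal sketch; the function name is an assumption, and the partition is passed as a tuple of positive parts in non-increasing order.

```python
def diagonal_description(lam):
    """Return (s, b, a) for a partition given as a non-increasing tuple."""
    # Column lengths of the diagram, i.e. the associated partition.
    columns = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    # The diagonal strikes row j (1-based) exactly when lambda_j >= j.
    s = sum(1 for j, part in enumerate(lam) if part >= j + 1)
    b = tuple(lam[j] - (j + 1) for j in range(s))      # dots to the right
    a = tuple(columns[j] - (j + 1) for j in range(s))  # dots below
    return s, b, a

# The relation n = sum of (b_j + a_j + 1) holds for each example.
s, b, a = diagonal_description((3, 2, 2, 1, 1))
n_check = sum(bj + aj + 1 for bj, aj in zip(b, a))
```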


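We denote, for convenience, by $\chi_{(\lambda)}(p)$ the characters of the class $\alpha_1 = n - p$, $\alpha_p = 1$ so that, for instance, $\chi_{(\lambda)}(2)$ are the characters of the transposition class whilst $\chi_{(\lambda)}(1)$ are the characters of the unit class (i.e. the dimensions of the various irreducible representations). Our object is to obtain for $\chi_{(\lambda)}(p)$ $(p = 2, 3, 4, \cdots)$, an explicit formula analogous to (11) which furnishes $\chi_{(\lambda)}(1)$. The recurrence formula (13) tells us that $\chi_{(\lambda)}(p)$ is the sum of the dimensions of the irreducible representations

$$D(\lambda_1 - p, \lambda_2, \cdots, \lambda_k), \quad D(\lambda_1, \lambda_2 - p, \cdots, \lambda_k), \quad \cdots, \quad D(\lambda_1, \lambda_2, \cdots, \lambda_k - p)$$

of the symmetric group on $n - p$ letters (where we follow the previously agreed on convention for the restoration of the normal non-increasing order of the $(\lambda_1, \lambda_2, \cdots)$ when this has been destroyed by the subtraction of $p$). Writing, as before,

$$l_1 = \lambda_1 + (k-1), \quad l_2 = \lambda_2 + (k-2), \quad \cdots, \quad l_k = \lambda_k$$

the dimension of $D(\lambda_1 - p, \lambda_2, \cdots, \lambda_k)$ is

$$(n-p)!\;\Delta(l_1 - p, l_2, \cdots, l_k) \div (l_1 - p)!\, l_2! \cdots l_k!$$

and there are similar expressions for the dimensions of the other irreducible representations. On dividing through by

$$\chi_{(\lambda)}(1) = n!\,\Delta(l_1, \cdots, l_k) \div l_1!\, l_2! \cdots l_k!$$

the quotient $\chi_{(\lambda)}(p) \div \chi_{(\lambda)}(1)$ appears as a sum of $k$ terms of which the first is

$$\frac{l_1(l_1 - 1) \cdots (l_1 - p + 1)}{n(n-1) \cdots (n-p+1)} \cdot \frac{(l_1 - p - l_2) \cdots (l_1 - p - l_k)}{(l_1 - l_2) \cdots (l_1 - l_k)}.$$

If we write $f(x) = (x - l_1) \cdots (x - l_k)$ this may be written as the quotient of $l_1(l_1 - 1) \cdots (l_1 - p + 1)\, f(l_1 - p)$ by $-p\,n(n-1) \cdots (n-p+1)\, f'(l_1)$ where $f'$ indicates the derivative of $f$; hence

$$\frac{\chi_{(\lambda)}(p)}{\chi_{(\lambda)}(1)} = \frac{-1}{p\,n(n-1) \cdots (n-p+1)} \sum_{j=1}^{k} \frac{l_j(l_j - 1) \cdots (l_j - p + 1)\, f(l_j - p)}{f'(l_j)}.$$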
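The quotient $\chi_{(\lambda)}(p) \div \chi_{(\lambda)}(1)$ can be evaluated directly from this sum over the numbers $l_j$. The sketch below is illustrative only (the function name is hypothetical); it uses exact rational arithmetic and assumes $1 \le p \le n$.

```python
from fractions import Fraction
from math import prod

def character_ratio(lam, p):
    """chi(p) / chi(1) for the class with one cycle on p letters, rest unary."""
    k, n = len(lam), sum(lam)
    l = [lam[j] + (k - 1 - j) for j in range(k)]

    def f(x):
        # f(x) = (x - l_1) ... (x - l_k)
        return prod(x - lj for lj in l)

    def falling(x, r):
        # x (x - 1) ... (x - r + 1)
        return prod(x - i for i in range(r))

    total = Fraction(0)
    for j in range(k):
        # f'(l_j) is the product of (l_j - l_i) over i != j.
        f_prime = prod(l[j] - l[i] for i in range(k) if i != j)
        total += Fraction(falling(l[j], p) * f(l[j] - p), f_prime)
    return -total / (p * falling(n, p))
```

For example `character_ratio((3, 1), 2)` gives the ratio of the transposition character to the dimension for $D(3, 1)$.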


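Now the analysis of the function $x(x-1) \cdots (x-p+1)\, f(x-p) \div f(x)$ into simple fractions yields a polynomial in $x$ plus terms $A_j \div (x - l_j)$ where

$$A_j = l_j(l_j - 1) \cdots (l_j - p + 1)\, f(l_j - p) \div f'(l_j)$$

so that

$$\chi_{(\lambda)}(p) \div \chi_{(\lambda)}(1) = -\Big(\sum_{j=1}^{k} A_j\Big) \div p\,n(n-1) \cdots (n-p+1).$$

On writing $(x - l_j)^{-1} = (1/x) + (l_j/x^2) + \cdots$ it is clear that $\sum_{j=1}^{k} A_j$ is the coefficient of $(1/x)$ in the development of

$$x(x-1) \cdots (x-p+1)\, f(x-p) \div f(x)$$

as a series of descending powers of $x$. The zeros of $f(x)$ are the $k$ numbers $(l_1, \cdots, l_k)$ so that, if $y = x - k$, the zeros of $f(y + k)$ are the $k$ numbers $l_j - k$, i.e. the $k$ numbers $\lambda_j - j$ $(j = 1, \cdots, k)$. Of these the first $s$ are the numbers $(b_1, \cdots, b_s)$ whilst the remaining $k - s$ are the negatives of $a'_j + 1$ where the two sets $a = (a_1, \cdots, a_s)$ and $a' = (a'_{s+1}, \cdots, a'_k)$ together form the set $0, 1, \cdots, (k-1)$. Hence

$$f(y + k) = \prod_{j=1}^{s} (y - b_j) \prod_{j=s+1}^{k} (y + a'_j + 1) = \prod_{j=1}^{s} \frac{y - b_j}{y + a_j + 1}\; (y+1)(y+2) \cdots (y+k).$$

It will be convenient to denote the function

$$\prod_{j=1}^{s} (y - b_j)/(y + a_j + 1)$$

by $F(y)$ and then $f(x) = f(y + k) = F(y)(y+1) \cdots (y+k)$. The desired sum $\sum A_j$, being the coefficient of $1/x$ in the development of

$$x(x-1) \cdots (x-p+1)\, f(x-p) \div f(x)$$

in a series of descending powers of $x$, is, equivalently, the coefficient of $1/y$ in the development of this same function in a descending series of powers of $y$. But

$$x(x-1) \cdots (x-p+1)\, f(x-p) \div f(x) = y(y-1) \cdots (y-p+1)\, F(y-p) \div F(y)$$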

and we have merely to seek the coefficient of $(1/y)$ in the development of this function. An application of Taylor's expansion yields

$$F(y-p) \div F(y) = 1 - pF'(y)/F(y) + p^2F''(y)/2!\,F(y) - p^3F'''(y)/3!\,F(y) + \cdots;$$

on taking the logarithmic derivative of

$$F(y) = \prod_{j=1}^{s}\,(y - b_j)/(y + a_j + 1)$$


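we find

$$F'(y)/F(y) = \sum_{j=1}^{s}\{[1/(y - b_j)] - [1/(y + a_j + 1)]\} = (n/y^2) + (c_3/y^3) + (c_4/y^4) + \cdots$$

where

$$c_m = \sum_{j=1}^{s}\{b_j^{\,m-1} - (-1)^{m-1}(a_j + 1)^{m-1}\} \qquad (m = 3, 4, \cdots)$$

(we have availed ourselves of the relation $\sum_{j=1}^{s}\{b_j + (a_j + 1)\} = n$). On successive differentiation of this relation we find

$$F''(y)/F(y) = \{F'(y)/F(y)\}' + \{F'(y)/F(y)\}^2 = (-2n/y^3) + \{(n^2 - 3c_3)/y^4\} + \{(2nc_3 - 4c_4)/y^5\} + \cdots$$

$$F'''(y)/F(y) = \{F''(y)/F(y)\}' + \{F''(y)/F(y)\}\{F'(y)/F(y)\} = (6n/y^4) + \{(12c_3 - 6n^2)/y^5\} + \cdots$$

$$F''''(y)/F(y) = \{F'''(y)/F(y)\}' + \{F'''(y)/F(y)\}\{F'(y)/F(y)\} = (-24n/y^5) + \cdots$$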
Hence

$$F(y-p) \div \{-pF(y)\} = -(1/p) + (n/y^2) + \{(pn + c_3)/y^3\} + \{(2c_4 + 3pc_3 + 2np^2 - pn^2)/2y^4\} + \{c_5 + 2pc_4 + p(2p - n)c_3 + np^2(p - n)\}/y^5 + \cdots$$

This has to be multiplied by $y(y-1)\cdots(y-p+1)$ and the coefficient of $y^{-1}$ in the product then determined; equivalently we may multiply by $(y-1)\cdots(y-p+1)$ and determine the coefficient of $y^{-2}$. This coefficient yields, when divided by $n(n-1)\cdots(n-p+1)$, the desired quantity $\chi_\lambda(p) \div \chi_\lambda(1)$. We carry out the calculation for $p = 2, 3, 4$.


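$p = 2$; $\chi_\lambda(2) \div \chi_\lambda(1) = (n + c_3) \div n(n-1)$.

Since

$$c_3 = \sum_{j=1}^{s}\{b_j^2 - (a_j + 1)^2\}, \qquad n = \sum_{j=1}^{s}\{b_j + (a_j + 1)\},$$

we have $n + c_3 = \sum_{j=1}^{s}\{b_j(b_j + 1) - a_j(a_j + 1)\}$, so that

$$(15)\qquad \chi_\lambda(2) \div \chi_\lambda(1) = \Big[\sum_{j=1}^{s}\{b_j(b_j + 1) - a_j(a_j + 1)\}\Big] \div n(n-1).$$

$p = 3$; here we must multiply by $(y-1)(y-2)$ and the coefficient of $y^{-2}$ is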




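$$(2c_4 + 3c_3 - 3n^2 + 4n)/2 = \tfrac{1}{2}\Big[\sum_{j=1}^{s}\{2b_j^3 + 3b_j^2 + b_j + 2(a_j+1)^3 - 3(a_j+1)^2 + (a_j+1)\} - 3n(n-1)\Big].$$

Hence

$$(16)\qquad \chi_\lambda(3) \div \chi_\lambda(1) = \Big[\sum_{j=1}^{s}\{b_j(b_j+1)(2b_j+1) + a_j(a_j+1)(2a_j+1)\} - 3n(n-1)\Big] \div 2n(n-1)(n-2).$$

$p = 4$; here we must multiply by $(y-1)(y-2)(y-3) = y^3 - 6y^2 + 11y - 6$ and the coefficient of $y^{-2}$ is

$$c_5 + 2c_4 + c_3 - 2(2n-3)(c_3 + n) = \sum_{j=1}^{s}\{b_j^2(b_j+1)^2 - a_j^2(a_j+1)^2\} - 2(2n-3)\sum_{j=1}^{s}\{b_j(b_j+1) - a_j(a_j+1)\},$$

and hence

$$(17)\qquad \chi_\lambda(4) \div \chi_\lambda(1) = \Big[\sum_{j=1}^{s}\{b_j^2(b_j+1)^2 - a_j^2(a_j+1)^2\} - 2(2n-3)\sum_{j=1}^{s}\{b_j(b_j+1) - a_j(a_j+1)\}\Big] \div n(n-1)(n-2)(n-3).$$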
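The three closed formulas for p = 2, 3, 4 can be checked numerically. The following is a minimal sketch, assuming that the numbers $b_j$, $a_j$ are the usual Frobenius coordinates of the partition ($b_j = \lambda_j - j$ and $a_j = \lambda'_j - j$ along the diagonal, $\lambda'$ being the conjugate partition); the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction

def frobenius_coordinates(lam):
    """Return the lists (a, b): b_j = lam_j - j and a_j = lam'_j - j for j = 1..s."""
    # conj[i] is the length of column i of the diagram (the conjugate partition).
    conj = [sum(1 for part in lam if part > i) for i in range(lam[0])]
    # s is the number of boxes on the main diagonal.
    s = sum(1 for j, part in enumerate(lam, start=1) if part >= j)
    b = [lam[j] - (j + 1) for j in range(s)]
    a = [conj[j] - (j + 1) for j in range(s)]
    return a, b

def character_ratio(lam, p):
    """chi_lam(p) / chi_lam(1) for p = 2, 3, 4, following formulas (15), (16), (17)."""
    n = sum(lam)
    a, b = frobenius_coordinates(lam)
    t = lambda x: x * (x + 1)
    if p == 2:
        num = sum(t(y) for y in b) - sum(t(x) for x in a)
        den = n * (n - 1)
    elif p == 3:
        num = (sum(t(y) * (2 * y + 1) for y in b)
               + sum(t(x) * (2 * x + 1) for x in a) - 3 * n * (n - 1))
        den = 2 * n * (n - 1) * (n - 2)
    elif p == 4:
        num = (sum(t(y) ** 2 for y in b) - sum(t(x) ** 2 for x in a)
               - 2 * (2 * n - 3) * (sum(t(y) for y in b) - sum(t(x) for x in a)))
        den = n * (n - 1) * (n - 2) * (n - 3)
    else:
        raise ValueError("only p = 2, 3, 4 are covered")
    return Fraction(num, den)
```

For instance `character_ratio((3, 1), 4)` gives -1/3, in agreement with the character -1 of the cycle of four letters in the three-dimensional representation D(3, 1).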


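For higher values of $p$ it is more serviceable to use the recurrence formula (13), as the expressions deduced in the manner described above become too complicated. The formula (15) of Frobenius may be readily transformed into an equivalent formula due to Hund (see reference (28)). We have $b_j = \lambda_j - j$ $(j = 1, \cdots, s)$, $a_j + 1 = j - \lambda_j$ $(j = s+1, \cdots, k)$, where $(a_1, \cdots, a_s)$ and $(a_{s+1}, \cdots, a_k)$ together form the set $(0, \cdots, k-1)$; so that

$$\sum_{j=1}^{s} a_j(a_j + 1) = \sum_{p=0}^{k-1} p(p+1) - \sum_{j=s+1}^{k} a_j(a_j + 1) = \sum_{p=1}^{k} p(p-1) - \sum_{j=s+1}^{k}(\lambda_j - j)(\lambda_j - j + 1).$$

Hence

$$\sum_{j=1}^{s}\{b_j(b_j+1) - a_j(a_j+1)\} = \sum_{j=1}^{k}(\lambda_j - j)(\lambda_j - j + 1) - \sum_{p=1}^{k} p(p-1) = \sum_{j=1}^{k}\lambda_j(\lambda_j - 2j + 1)$$

so that

$$\chi_\lambda(2) \div \chi_\lambda(1) = \sum_{j=1}^{k}\lambda_j(\lambda_j - 2j + 1) \div n(n-1)$$

which is Hund's formula.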
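Hund's formula is easy to evaluate directly from the parts of the partition. The sketch below compares it with the well-known expression of the same ratio through the contents (column index minus row index) of the boxes of the diagram; that second expression is not taken from the paper and is used here only as an independent check. The function names are illustrative.

```python
from fractions import Fraction

def hund_ratio(lam):
    """chi_lam(2) / chi_lam(1) from Hund's formula: sum of lam_j (lam_j - 2j + 1) over n(n - 1)."""
    n = sum(lam)
    total = sum(part * (part - 2 * j + 1) for j, part in enumerate(lam, start=1))
    return Fraction(total, n * (n - 1))

def content_ratio(lam):
    """The same ratio as twice the sum of the contents of the boxes, over n(n - 1)."""
    n = sum(lam)
    contents = sum(col - row for row, part in enumerate(lam) for col in range(part))
    return Fraction(2 * contents, n * (n - 1))
```

Both functions return 1/14 for the partition (4, 2, 2) of 8.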


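The analysis of the reducible representations $A(\lambda_1, \cdots, \lambda_k)$ into irreducible components. The characteristic of $A(\lambda_1, \cdots, \lambda_k)$ is $q_{\lambda_1}(s) \cdots q_{\lambda_k}(s)$ whilst that of the irreducible representation $D(\lambda_1, \cdots, \lambda_k)$ is

$$\{\lambda\}(s) = \begin{vmatrix} q_{\lambda_1}(s) & \cdots & q_{\lambda_1 + k - 1}(s) \\ \vdots & & \vdots \\ q_{\lambda_k - k + 1}(s) & \cdots & q_{\lambda_k}(s) \end{vmatrix};$$

the problem confronting us is that of writing $q_{\lambda_1}(s) \cdots q_{\lambda_k}(s)$ as a linear combination, with positive or zero integral coefficients, of the various simple characteristics $\{\lambda\}(s)$. When $k = 2$ the solution is trivially evident: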


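$$q_{\lambda_1}(s)\,q_{\lambda_2}(s) = \begin{vmatrix} q_{\lambda_1}(s) & q_{\lambda_1+1}(s) \\ q_{\lambda_2-1}(s) & q_{\lambda_2}(s) \end{vmatrix} + \begin{vmatrix} q_{\lambda_1+1}(s) & q_{\lambda_1+2}(s) \\ q_{\lambda_2-2}(s) & q_{\lambda_2-1}(s) \end{vmatrix} + \cdots$$

so that

$$(18)\qquad A(\lambda_1, \lambda_2) = D(\lambda_1, \lambda_2) + D(\lambda_1 + 1, \lambda_2 - 1) + D(\lambda_1 + 2, \lambda_2 - 2) + \cdots + D(n).$$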


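For $k > 2$ the problem may be solved as follows (we illustrate by considering the case $k = 3$). Let $x_j$ be an operator whose effect is to replace $q_{\lambda_j}(s)$ by $q_{\lambda_j + 1}(s)$: $x_j\,q_{\lambda_j}(s) = q_{\lambda_j + 1}(s)$ $(j = 1, 2, 3)$. Then the determinant

$$\{\lambda\}(s) = \begin{vmatrix} q_{\lambda_1}(s) & q_{\lambda_1+1}(s) & q_{\lambda_1+2}(s) \\ q_{\lambda_2-1}(s) & q_{\lambda_2}(s) & q_{\lambda_2+1}(s) \\ q_{\lambda_3-2}(s) & q_{\lambda_3-1}(s) & q_{\lambda_3}(s) \end{vmatrix}$$

may be expressed as the result of operating with

$$\begin{vmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{vmatrix}$$

upon the simple product $q_{\lambda_1}(s)\,q_{\lambda_2-1}(s)\,q_{\lambda_3-2}(s)$ and since the operators $x_j$ operate on different symbols, $x_j$ operating on $q_{\lambda_j}$, they are commutative so that we may apply the ordinary rules of commutative algebra. Thus

$$\{\lambda\}(s) = (x_3 - x_2)(x_3 - x_1)(x_2 - x_1)\,q_{\lambda_1}(s)\,q_{\lambda_2-1}(s)\,q_{\lambda_3-2}(s)$$

and so

$$q_{\lambda_1}(s)\,q_{\lambda_2-1}(s)\,q_{\lambda_3-2}(s) = (x_3 - x_2)^{-1}(x_3 - x_1)^{-1}(x_2 - x_1)^{-1}\{\lambda\}(s).$$

We write now $\xi_j = x_j^{-1}$ $(j = 1, 2, 3)$, so that $\xi_j$ operates on $\lambda_j$ so as to decrease it by unity: $\xi_j\,q_{\lambda_j} = q_{\lambda_j - 1}$. Then

$$(x_3 - x_2)^{-1} = \xi_3(1 - \xi_3x_2)^{-1},$$

with similar expressions for $(x_3 - x_1)^{-1}$ and $(x_2 - x_1)^{-1}$,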




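and so

$$q_{\lambda_1}(s)\,q_{\lambda_2}(s)\,q_{\lambda_3}(s) = (1 - \xi_3x_2)^{-1}(1 - \xi_3x_1)^{-1}(1 - \xi_2x_1)^{-1}\{\lambda\}(s).$$

The product

$$(1 - \xi_3x_2)^{-1}(1 - \xi_3x_1)^{-1} = 1 + p_1(x_1, x_2)\xi_3 + p_2(x_1, x_2)\xi_3^2 + \cdots$$

and the series may be stopped at $p_{\lambda_3}(x_1, x_2)\xi_3^{\lambda_3}$ since each $\{\lambda\}$ with a negative number at the end vanishes. Then this must be multiplied by

$$(1 - \xi_2x_1)^{-1} = 1 + p_1(x_1)\xi_2 + p_2(x_1)\xi_2^2 + \cdots$$

and this series may be stopped, for each $\{\lambda'\}$, at $p_{\lambda'_2+1}(x_1)\xi_2^{\lambda'_2+1}$ since each $\{\lambda\}$ whose next to last member $< -1$ vanishes. It is clear then that $A(\lambda)$ contains $D(\lambda)$ once and contains no $D(\lambda')$ for which $(\lambda) > (\lambda')$; it being understood that $(\lambda) > (\lambda')$ when the first non-vanishing member of the set $\lambda_j - \lambda'_j$ is positive $(j = 1, 2, \cdots)$. The argument is evidently perfectly general; thus for $k = 4$ we first operate with $1 + p_1(x_1, x_2, x_3)\xi_4 + p_2(x_1, x_2, x_3)\xi_4^2 + \cdots$ on $\{\lambda\}$; then follow this by $1 + p_1(x_1, x_2)\xi_3 + \cdots$ and finally by $1 + p_1(x_1)\xi_2 + p_2(x_1)\xi_2^2 + \cdots$.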

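The following example will sufficiently illustrate the method: consider $A(4, 2^2)$. Applying $1 + p_1(x_1, x_2)\xi_3 + p_2(x_1, x_2)\xi_3^2$ to $\{4, 2, 2\}$ we obtain

$$\{4, 2^2\} + \{5, 2, 1\} + \{4, 3, 1\} + \{6, 2\} + \{5, 3\} + \{4, 4\}.$$

Applying $1 + p_1(x_1)\xi_2 + \cdots$ to each of these we obtain in turn (the vanishing symbols, such as $\{5, 1, 2\}$ and $\{7, 0, 1\}$, being omitted)

$$\{4, 2, 2\} + \{6, 0, 2\} + \{7, -1, 2\};\quad \{5, 2, 1\} + \{6, 1^2\} + \{8, -1, 1\};$$

$$\{4, 3, 1\} + \{5, 2, 1\} + \{6, 1^2\} + \{8, -1, 1\};\quad \{6, 2\} + \{7, 1\} + \{8\};$$

$$\{5, 3\} + \{6, 2\} + \{7, 1\} + \{8\};\quad \{4, 4\} + \{5, 3\} + \{6, 2\} + \{7, 1\} + \{8\}$$

and adding up (with $\{6, 0, 2\} = -\{6, 1^2\}$, $\{7, -1, 2\} = -\{7, 1\}$ and $\{8, -1, 1\} = -\{8\}$ on restoring the normal order) we find

$$A(4, 2^2) = D(4, 2^2) + D(4, 3, 1) + D(4^2) + 2D(5, 2, 1) + 2D(5, 3) + D(6, 1^2) + 3D(6, 2) + 2D(7, 1) + D(8).$$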
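The coefficients found above can be cross-checked by a different route. The sketch below does not implement the operator method of the text; it assumes the standard combinatorial fact that the coefficient of $D(\mu)$ in $A(\lambda)$ equals the number of ways of stripping the diagram of $\mu$ down to nothing by removing horizontal strips of $\lambda_k, \lambda_{k-1}, \cdots, \lambda_1$ boxes in turn (the Kostka number). All names are illustrative.

```python
from functools import lru_cache

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None or largest > n:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(largest, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def horizontal_strips(mu, size):
    """Yield every partition obtained from mu by removing `size` boxes, no two in one column."""
    def rec(i, remaining, prefix):
        if i == len(mu):
            if remaining == 0:
                yield tuple(p for p in prefix if p > 0)
            return
        # Row i may not become shorter than the original row below it.
        lower = mu[i + 1] if i + 1 < len(mu) else 0
        for removed in range(min(mu[i] - lower, remaining) + 1):
            yield from rec(i + 1, remaining - removed, prefix + [mu[i] - removed])
    yield from rec(0, size, [])

@lru_cache(maxsize=None)
def kostka(mu, lam):
    """Multiplicity of D(mu) in A(lam); mu and lam are tuples with the same sum."""
    if not lam:
        return 1 if not mu else 0
    return sum(kostka(nu, lam[:-1]) for nu in horizontal_strips(mu, lam[-1]))

def analyse(lam):
    """Return {mu: multiplicity} for the non-zero terms of A(lam)."""
    lam = tuple(lam)
    return {mu: kostka(mu, lam) for mu in partitions(sum(lam)) if kostka(mu, lam)}
```

Calling `analyse((4, 2, 2))` reproduces the nine terms of the analysis of $A(4, 2^2)$ obtained above.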
When there are but three elements $(\lambda_1, \lambda_2, \lambda_3)$ in the partition the theory just given leads to the following convenient formula. Denoting by $12$ the operation $1 + x_1\xi_2 + x_1^2\xi_2^2 + \cdots$ we have to apply to $(\lambda_1, \lambda_2, \lambda_3)$ the operator

$$12 + (x_2 + 2x_1\cdot 12)\,\xi_3 + (x_2^2 + 2x_1x_2 + 3x_1^2\cdot 12)\,\xi_3^2 + (x_2^3 + 2x_1x_2^2 + 3x_1^2x_2 + 4x_1^3\cdot 12)\,\xi_3^3 + \cdots$$

For example let us consider the analysis of $A(4^2, 2)$; the application of $12$ gives

$$\{4^2, 2\} + \{5, 3, 2\} + \{6, 2^2\} + \{7, 1, 2\} + \{8, 0, 2\} + \{9, -1, 2\} = \{4^2, 2\} + \{5, 3, 2\} + \{6, 2^2\} - \{8, 1^2\} - \{9, 1\};$$

$x_2\xi_3$ yields $\{4, 5, 1\} = 0$; $2x_1\cdot 12\cdot\xi_3$ yields

$$2[\{5, 4, 1\} + \{6, 3, 1\} + \{7, 2, 1\} + \{8, 1^2\} - \{10\}];$$

$(x_2^2 + 2x_1x_2)\,\xi_3^2$ yields

$$\{4, 6\} + 2\{5^2\} = \{5^2\}$$

whilst, finally, $3x_1^2\cdot 12\cdot\xi_3^2$ yields

$$3[\{6, 4\} + \{7, 3\} + \{8, 2\} + \{9, 1\} + \{10\}].$$

Combining these results we obtain

$$A(4^2, 2) = D(10) + 2D(9, 1) + 3D(8, 2) + D(8, 1^2) + 3D(7, 3) + 2D(7, 2, 1) + 3D(6, 4) + 2D(6, 3, 1) + D(6, 2^2) + D(5^2) + 2D(5, 4, 1) + D(5, 3, 2) + D(4^2, 2).$$
Whilst the formula just given may be regarded as a complete theoretical solution of the problem of analysing the reducible representation $A(\lambda)$ into its irreducible components it becomes very tedious when $k$, the number of elements in the partition $(\lambda)$, $\geq 4$. Fortunately the necessary information, up to $n = 11$, is available in tables prepared by Kostka.¹ This writer was interested in the general question of symmetric functions and in the course of his investigation took up the question of expressing a product $\sigma_{\lambda_1}(x) \cdots \sigma_{\lambda_k}(x)$ of the elementary symmetric functions of $n$ variables $(x_1, \cdots, x_n)$ as a linear combination of determinants

$$\begin{vmatrix} \sigma_{n_1}(x) & \cdots & \sigma_{n_1 + j - 1}(x) \\ \vdots & & \vdots \\ \sigma_{n_j - j + 1}(x) & \cdots & \sigma_{n_j}(x) \end{vmatrix}.$$

Since $\sigma_m(x) = q_m(s_1, -s_2, s_3, -s_4, \cdots)$ it follows that such an analysis of the product $\sigma_{\lambda_1}(x) \cdots \sigma_{\lambda_k}(x)$ furnishes the analysis of the product $q_{\lambda_1}(s) \cdots q_{\lambda_k}(s)$ as a linear combination of the simple characteristics $\{n_1, \cdots, n_j\}$; or, equivalently, of the reducible representations $A(\lambda_1, \cdots, \lambda_k)$ as a linear combination of the irreducible representations

¹ The fact that these tables, up to $n = 8$, were published in 1882, long before the representation theory of the symmetric group, and its applications, were dreamed of, recalls to mind the verse in Ecclesiastes: "Nothing under the sun is new, neither is any man able to say: Behold this is new: for it hath already gone before in the ages that were before us."


$D(n_1, \cdots, n_j)$. The paper numbered (18) in the list of references gives the tables for $2 \leq n \leq 8$; that numbered (19) the table for $n = 9$ and that numbered (20), which is inaccessible to us, the tables for $10 \leq n \leq 11$. In using these tables note that the $A(\lambda)$ appear at the bottom (the symbol $K$ being used instead of $A$ and the partition being indicated by a suffix, a non-decreasing rather than a non-increasing order being adopted: thus $A(3^2, 2)$ appears as $K_{2\,3^2}$); the $D(\lambda)$ appear on the right (the symbol $C$ being used instead of $D$, and the partition being indicated by a suffix, the normal non-increasing order being used: thus $D(4, 2^2)$ appears as $C_{4\,2^2}$). The tables are square but the part above the main diagonal serves another purpose and is not used in the problem that concerns us. The following method of deriving Kostka's tables (or, more particularly, that part of them which is effective for our problem) was suggested by Littlewood and Richardson (13), on the assumption that the character table of the symmetric group in question is at hand. The product $q_{\lambda_1}(s) \cdots q_{\lambda_k}(s)$, which is the characteristic of $A(\lambda_1, \cdots, \lambda_k)$, is a linear combination, $\sum c_{(\alpha)} s^{(\alpha)}$ say, of the quantities $s^{(\alpha)} = s_1^{\alpha_1} s_2^{\alpha_2} \cdots s_n^{\alpha_n}$ and hence, by (3), a linear combination $\sum_{(\alpha)} \sum_{(\beta)} c_{(\alpha)}\,\chi_{(\beta)}^{(\alpha)}\,\phi_{(\beta)}(s)$ of the simple characteristics of the symmetric group on $n$ letters. Hence the coefficient of $D(\beta)$ in the analysis of $A(\lambda)$ is $\sum_{(\alpha)} c_{(\alpha)}\,\chi_{(\beta)}^{(\alpha)}$; in other words it is obtained by taking the indicated linear combination of those columns of the character table which correspond to classes $(\alpha)$ for which $s^{(\alpha)}$ appears in the product $q_{\lambda_1}(s) \cdots q_{\lambda_k}(s)$. This method is particularly suited to those partitions $(\lambda)$ for which many of the $\lambda_j$ are unity. For instance if they are all unity $q_{\lambda_1}(s) \cdots q_{\lambda_k}(s) = \{q_1(s)\}^n = s_1^n$ and so the coefficients in the analysis of $A(1^n)$ are merely the characters of the unit class; in other words the coefficient of $D(\lambda)$ in the analysis of $A(1^n)$ is the dimension of $D(\lambda)$; a fact of which we have independent knowledge since $A(1^n)$ is the regular representation, of dimension $n!$, of the symmetric group. For $A(2, 1^{n-2})$ the product $q_2(s)\{q_1(s)\}^{n-2}$ is $(s_1^n + s_1^{n-2}s_2) \div 2$ and hence the coefficient of $D(\lambda)$ in the analysis of $A(2, 1^{n-2})$ is the mean of the characters $\chi_\lambda(1)$, $\chi_\lambda(2)$ of the unit class and the transposition class, respectively. But it is clear that whilst the analysis of $A(p, 1^{n-p})$ is relatively simple by this method a partition with three or more parts $\geq 2$ leads to somewhat complicated calculations. Thus to make the analysis of $A(3, 2^3)$ we would have to evaluate

$$q_3(s)\{q_2(s)\}^3 = (s_1^3 + 3s_1s_2 + 2s_3)(s_1^2 + s_2)^3 \div 3!\,(2!)^3$$

and then form the indicated combination of the many columns of the character table (of the symmetric group on 9 letters) involved. We give below the




analysis of all $A(\lambda)$ for $n \leq 9$ and in making the calculations found the following method the most convenient. It rests on a knowledge of the analysis of the product $\{\lambda\}\{\mu\}$ of two simple characteristics which is given (for $\sum\lambda + \sum\mu \leq 9$) in the following section. Suppose we wish to analyse the reducible representation $A(3, 2, 1)$ of the symmetric group on six letters. We have $q_2q_1 = \{3\} + \{2, 1\}$ so that, since $q_3 = \{3\}$, $q_3q_2q_1 = \{3\}\{3\} + \{3\}\{2, 1\}$. From the values given at the end of the next section we read off

$$\{3\}\{3\} = \{6\} + \{5, 1\} + \{4, 2\} + \{3^2\},$$

$$\{3\}\{2, 1\} = \{5, 1\} + \{4, 2\} + \{4, 1^2\} + \{3, 2, 1\}$$

so that

$$q_3q_2q_1 = \{6\} + 2\{5, 1\} + 2\{4, 2\} + \{4, 1^2\} + \{3^2\} + \{3, 2, 1\}$$

or, equivalently,

$$A(3, 2, 1) = D(6) + 2D(5, 1) + 2D(4, 2) + D(4, 1^2) + D(3^2) + D(3, 2, 1).$$


In the following tables the irreducible representations are written across the top, the $D$ being omitted in the interest of space, and the reducible representations are written down the left. For convenience of printing, Table 8, $n = 9$, is turned around so that the bottom of the page is the left-hand side of the table and the left-hand side of the page the top of the table. As examples of how the tables are read we cite the following:

$n = 2$; $A(1^2) = D(2) + D(1^2)$
$n = 3$; $A(2, 1) = D(3) + D(2, 1)$
$n = 4$; $A(2, 1^2) = D(4) + 2D(3, 1) + D(2^2) + D(2, 1^2)$
$n = 5$; $A(2^2, 1) = D(5) + 2D(4, 1) + 2D(3, 2) + D(3, 1^2) + D(2^2, 1)$
$n = 6$; $A(2^3) = D(6) + 2D(5, 1) + 3D(4, 2) + D(4, 1^2) + D(3^2) + 2D(3, 2, 1) + D(2^3)$


The numbers to the right of the main diagonal are all zero and are not written in. It may be observed that there is considerable duplication in the tables, the coefficients of the analysis of $A(\lambda)$ being independent of $n$ for the earlier partitions. Thus the coefficients in the analysis of the first twelve partitions of 9, from $(9)$ to $(5, 1^4)$ inclusive, are the same as those in the analyses of the first twelve partitions of 8, from $(8)$ to $(4, 1^4)$ inclusive. The table for $n = 10$ coincides with that for $n = 9$ for the first 19 partitions (from $(10)$ to $(5, 1^5)$ inclusive) it being understood that the column under $(5^2)$ is filled with 1's (from the partition $(5^2)$ to $(5, 1^5)$); the correspondent to this column, namely $(4, 5)$, being absent from the table for $n = 9$. In completing the table for $n = 10$ it is convenient, in dealing with a four or five element partition which




ends in a 1 or 2, to use the corresponding partition in the table for $n = 9$ or 8, respectively. E.g. to obtain the analysis of $A(4, 3, 2, 1)$ we use, from the table for $n = 9$, the result

$$q_4q_3q_2 = \{9\} + 2\{8, 1\} + 3\{7, 2\} + \{7, 1^2\} + 3\{6, 3\} + 2\{6, 2, 1\} + 2\{5, 4\} + 2\{5, 3, 1\} + \{5, 2^2\} + \{4^2, 1\} + \{4, 3, 2\}.$$

Hence

$$q_4q_3q_2q_1 = \{9\}\{1\} + 2\{8, 1\}\{1\} + \cdots + \{4, 3, 2\}\{1\}$$

and from the theorem concerning the analysis of the direct product of irreducible representations, given in the following section, we have

$$\{9\}\{1\} = \{10\} + \{9, 1\};\quad \{8, 1\}\{1\} = \{9, 1\} + \{8, 2\} + \{8, 1^2\},\ \text{etc.},$$

so that on collecting we obtain

$$A(4, 3, 2, 1) = D(10) + 3D(9, 1) + 5D(8, 2) + 3D(8, 1^2) + 6D(7, 3) + 6D(7, 2, 1) + D(7, 1^3) + 5D(6, 4) + 7D(6, 3, 1) + 3D(6, 2^2) + 2D(6, 2, 1^2) + 2D(5^2) + 5D(5, 4, 1) + 4D(5, 3, 2) + 2D(5, 3, 1^2) + D(5, 2^2, 1) + 2D(4^2, 2) + D(4^2, 1^2) + D(4, 3^2) + D(4, 3, 2, 1).$$
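The collecting step just described is mechanical and can be sketched in a few lines. The code below assumes only the rule quoted in the text, namely that multiplying a simple characteristic by $\{1\}$ adds unity to one member of the partition (or appends a new member 1) in every way that leaves the order non-increasing. The names `add_box` and `times_one` are illustrative.

```python
from collections import Counter

def add_box(nu):
    """Yield the partitions obtained from nu by raising one member by unity."""
    padded = list(nu) + [0]
    for j in range(len(padded)):
        # Raising member j is allowed only if the result is still non-increasing.
        if j == 0 or padded[j - 1] > padded[j]:
            new = padded.copy()
            new[j] += 1
            yield tuple(p for p in new if p)

def times_one(expansion):
    """Multiply a linear combination {partition: coefficient} by the characteristic {1}."""
    out = Counter()
    for nu, coeff in expansion.items():
        for mu in add_box(nu):
            out[mu] += coeff
    return out

# The analysis of q4 q3 q2 quoted from the table for n = 9.
q4q3q2 = {
    (9,): 1, (8, 1): 2, (7, 2): 3, (7, 1, 1): 1, (6, 3): 3, (6, 2, 1): 2,
    (5, 4): 2, (5, 3, 1): 2, (5, 2, 2): 1, (4, 4, 1): 1, (4, 3, 2): 1,
}
a4321 = dict(times_one(q4q3q2))
```

The dictionary `a4321` then holds the twenty coefficients of the analysis of $A(4, 3, 2, 1)$ given above.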


Tables furnishing the analysis of A(λ) for values of n from 2 to 9 inclusive.

1. n = 2.
            (2)   (1^2)
A(2)         1
A(1^2)       1      1

2. n = 3.
            (3)   (2,1)  (1^3)
A(3)         1
A(2,1)       1      1
A(1^3)       1      2      1

3. n = 4.
            (4)   (3,1)  (2^2)  (2,1^2)  (1^4)
A(4)         1
A(3,1)       1      1
A(2^2)       1      1      1
A(2,1^2)     1      2      1       1
A(1^4)       1      3      2       3       1

4. n = 5.
            (5)   (4,1)  (3,2)  (3,1^2)  (2^2,1)  (2,1^3)  (1^5)
A(5)         1
A(4,1)       1      1
A(3,2)       1      1      1
A(3,1^2)     1      2      1       1
A(2^2,1)     1      2      2       1        1
A(2,1^3)     1      3      3       3        2        1
A(1^5)       1      4      5       6        5        4       1

5. n = 6.
             (6)  (5,1)  (4,2)  (4,1^2)  (3^2)  (3,2,1)  (3,1^3)  (2^3)  (2^2,1^2)  (2,1^4)  (1^6)
A(6)          1
A(5,1)        1     1
A(4,2)        1     1      1
A(4,1^2)      1     2      1       1
A(3^2)        1     1      1       0       1
A(3,2,1)      1     2      2       1       1       1
A(3,1^3)      1     3      3       3       1       2        1
A(2^3)        1     2      3       1       1       2        0       1
A(2^2,1^2)    1     3      4       3       2       4        1       1        1
A(2,1^4)      1     4      6       6       3       8        4       2        3         1
A(1^6)        1     5      9      10       5      16       10       5        9         5       1


6. n = 7.   7. n = 8.   8. n = 9.
[Tables 6 to 8 are illegible in this copy and are not reproduced.]


The direct product of irreducible representations. We consider two sets of $n$ and $m$ letters, the sets having no letter in common, so that the number of distinct letters in the two sets taken together is $n + m$. If $(\lambda)$ is an arbitrary partition of $n$ and $(\mu)$ an arbitrary partition of $m$ and $D(\lambda)$, $D(\mu)$ the attached irreducible representations of the symmetric groups on $n$ and $m$ letters, respectively, the direct product $D(\lambda)\cdot D(\mu)$ is a representation, in general reducible, of the symmetric group on $n + m$ letters whose characteristic is the product $\{\lambda\}\{\mu\}$ of the characteristics of $D(\lambda)$ and $D(\mu)$. Our problem is the analysis of $D(\lambda)\cdot D(\mu)$ into its irreducible components. For the sake of brevity we shall omit the symbols $D$ and write $(\lambda)\cdot(\mu)$ for $D(\lambda)\cdot D(\mu)$. If $(\nu)$ is a typical partition of $n + m$ a relation $(\lambda)\cdot(\mu) = \sum c_{(\nu)}(\nu)$ implies $\{\lambda\}\{\mu\} = \sum c_{(\nu)}\{\nu\}$ since $\{\lambda\}\{\mu\}$ is the characteristic of $(\lambda)\cdot(\mu)$. In order to arrive at a solution of our problem we first remark that the fundamental recurrence formula (13) may be generalised as follows. Let $\xi_j$ be an operator which decreases the $j$-th member $\lambda_j$ of the partition

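$$(\lambda) = (\lambda_1, \cdots, \lambda_k) = (\lambda_1, \cdots, \lambda_k, 0, \cdots, 0) = (\lambda_1, \cdots, \lambda_n)$$

by unity $(j = 1, \cdots, n)$. Then

$$\xi_j^{\,p}\{\lambda_1, \cdots, \lambda_n\} = \{\lambda_1, \cdots, \lambda_j - p, \cdots, \lambda_n\}$$

so that $\xi_j^{\,p}\{\lambda_1, \cdots, \lambda_n\} = 0$ if $j > k$, for then $\{\lambda_1, \cdots, \lambda_j - p, \cdots, \lambda_n\}$ ends in a negative integer after the zeros at the end have been discarded. On writing $S_p = \xi_1^{\,p} + \cdots + \xi_n^{\,p}$ we have

$$S_p\{\lambda_1, \cdots, \lambda_k\} = \sum_{j=1}^{k}\{\lambda_1, \cdots, \lambda_j - p, \cdots, \lambda_k\}$$

so that our formula (13) may be written in the form

$$\{\lambda_1, \cdots, \lambda_k\}_{(\alpha)} = S_p\{\lambda_1, \cdots, \lambda_k\}_{(\alpha')}$$

and we may say that we have stripped off one cycle of $p$ letters from $(\alpha)$. Following this by stripping from $(\alpha')$ a cycle of $q$ letters we obtain

$$\{\lambda_1, \cdots, \lambda_k\}_{(\alpha)} = S_qS_p\{\lambda_1, \cdots, \lambda_k\}_{(\alpha'')}$$

where $(\alpha'')$ is the class, of the symmetric group on $n - p - q$ letters, which contains one less cycle on $p$ letters and one less cycle on $q$ letters than the class $(\alpha)$ of the symmetric group on $n$ letters. More generally we may strip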




off $\beta_1$ unary cycles, $\beta_2$ binary cycles etc.; to write the corresponding generalisation of the recurrence relation (13) it is a little more convenient to change the notation slightly so that $n$ is replaced by $n + m$. Then $(\alpha)$ is a class of $n + m$ and we strip off $\beta_1$ unary cycles, $\beta_2$ binary cycles, $\cdots$, $\beta_n$ $n$-ary cycles where $(\beta)$ is a class of $n$. Denoting by $(\gamma)$ the class of $m$ which is such that $(\beta) + (\gamma) = (\alpha)$ our generalised recurrence relation appears in the form


By means of (3) this may be written in the form , 
(19) {A}iay = x9) (S) 


where the summation is over the partitions (6) of m and (B), (y) are any 
classes of n and m, respectively, whose sum is the class (a) of n + m. 
Let now 


1 
(6) 


be any simple characteristic of the symmetric group on n letters (so that (e) 
isa partition of n) and, similarly, let 


1 
(T) 


be any simple characteristic of the symmetric group on m letters, (v) being a 
partition of m; their product is 


1 
> N (8) M Xv) 
M (5) (7) 


1 
$e) (8) dr) (8) | 
and since (8) + (r) is a partition of n + m we have, from (3), 


p> (s) 
(a) 


the summation being over all partitions («) of n+ m. On substituting for 
xa) its value {69(S)%a}‘” from (19) and summing with respect 
0 


to (8) we obtain, in view of the orthogonality relations (1) between the char- 
acters of the symmetric group on n letters, 


1 
(20) $ 6) (8) (8) (S) hia}! (8) 


480 F. D. MURNAGHAN. 


Now ¢,¢)(S) is a symmetric function, of degree n, with integral coefficients, of 


the n + m operators and so is of the form >) where (7) isa 
(1) 
partition of-m and [&,."- - -é,"] denotes the symmetric function of the 
n-+ m operators (:,° whose leading term is The 
result of operating with - -&™ on Za) Is 
and we may denote the result of operating with [€,7- - -&™] on xa) by 


%a)-(7)}- The summation with respect to 7 yields zero, owing to the ortho- 
gonality relations between the characters of the symmetric group on m letters, 
save when (a) is such that one member of [ (a) — (7)] is the same as (v), 
with the same convention as before regarding the rearrangement of disordered 
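partitions; in which case the coefficient of $\phi_{(\alpha)}(s)$ is $c_{(\tau)}$. The simplest examples may serve to make the theory clear; thus let $(\epsilon) = (1)$ so that we wish to analyse $\{1\}\{\nu_1, \cdots, \nu_k\}$; the only $\phi_{(\alpha)}(s)$ which appear in this product are those for which $(\alpha)$ is obtained from $(\nu_1, \cdots, \nu_k, 0)$ by adding unity to one of its members and these all occur with coefficient unity; for

$$\phi_{(1)}(S) = S_1 = \xi_1 + \xi_2 + \cdots.$$

Hence

$$\{1\}\{\nu_1, \cdots, \nu_k\} = \{\nu_1 + 1, \nu_2, \cdots, \nu_k\} + \{\nu_1, \nu_2 + 1, \cdots, \nu_k\} + \cdots + \{\nu_1, \cdots, \nu_k, 1\}.$$

E.g.,

$$\{1\}\{4, 2^2\} = \{5, 2^2\} + \{4, 3, 2\} + \{4, 2, 3\} + \{4, 2^2, 1\} = \{5, 2^2\} + \{4, 3, 2\} + \{4, 2^2, 1\}$$

or, equivalently, $(1)\cdot(4, 2^2) = (5, 2^2) + (4, 3, 2) + (4, 2^2, 1)$. The next simplest example is furnished by $\{2\}\{\nu_1, \cdots, \nu_k\}$; here

$$\phi_{(2)}(S) = p_2(\xi) = \sum \xi_j^2 + \sum \xi_i\xi_j$$

and so

$$\{2\}\{\nu_1, \cdots, \nu_k\} = \sum_{j}\{\nu_1, \cdots, \nu_j + 2, \cdots, \nu_k\} + \sum_{i<j}\{\nu_1, \cdots, \nu_i + 1, \cdots, \nu_j + 1, \cdots\},$$

the terms $\{\nu_1 + 1, \cdots, \nu_k, 0, 1\}$ vanishing and the terms $\{\nu_1, \cdots, \nu_k, 0, 2\}$ and $\{\nu_1, \cdots, \nu_k, 1, 1\}$ cancelling one another. E.g.,

$$\{2\}\{3, 2\} = \{5, 2\} + \{3, 4\} + \{3, 2^2\} + \{4, 3\} + \{4, 2, 1\} + \{3^2, 1\} = \{5, 2\} + \{4, 3\} + \{4, 2, 1\} + \{3^2, 1\} + \{3, 2^2\}.$$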



Similarly since 
b3(S) = ps($) = + + 
we have 
+n 
+ {v, +1,-- 
+ in + + + 1,- - 


(the remaining terms vanishing or cancelling each other). 


{3} {2, 17} = (5, 17} + {2, 4,1} + {2, 1, 4} + (2, 12, 3} 
+ {4, 2, 1} + {3?, 1} + {4,1, 2} 
+ {3, 1,3} + {2, 3, 2} + (22, 3} 


+ (8,27) + + 12,1) + 1) 


Since $\phi_{(1^2)}(S) = \sigma_2(\xi) = \sum \xi_i\xi_j$ we have

$$\{1^2\}\{3, 2, 1\} = \{4, 3, 1\} + \{4, 2^2\} + \{3^2, 2\} + \{4, 2, 1^2\} + \{3^2, 1^2\} + \{3, 2^2, 1\} + \{3, 2, 1^3\}.$$


It is clear that whilst this method is entirely practicable when one of the factors is $\{2\}$, $\{3\}$, $\{1^2\}$, $\{1^3\}$ it rapidly becomes very tedious in other cases. We give below tables of all direct products $(\lambda)\cdot(\mu)$ for which $n + m \leq 9$ and we found the following method, which is sufficiently illustrated by an example, entirely convenient. Suppose we wish $\{3, 1\}\{2^2\}$; we write $\{3, 1\} = \{3\}\{1\} - \{4\}$ (since $\{3, 1\} = \begin{vmatrix} q_3 & q_4 \\ q_0 & q_1 \end{vmatrix} = q_3q_1 - q_4$) and we see that the calculation rests on that of $\{4\}\{2^2\}$. But $\{4, 2^2\} = \{4\}\{2^2\} - \{1\}\{5, 2\} + \{5, 3\}$ since

$$\{4, 2^2\} = \begin{vmatrix} q_4 & q_5 & q_6 \\ q_1 & q_2 & q_3 \\ q_0 & q_1 & q_2 \end{vmatrix} = q_4\begin{vmatrix} q_2 & q_3 \\ q_1 & q_2 \end{vmatrix} - q_1\begin{vmatrix} q_5 & q_6 \\ q_1 & q_2 \end{vmatrix} + q_0\begin{vmatrix} q_5 & q_6 \\ q_2 & q_3 \end{vmatrix}.$$

Hence

$$\{4\}\{2^2\} = \{4, 2^2\} + \{1\}\{5, 2\} - \{5, 3\} = \{6, 2\} + \{5, 2, 1\} + \{4, 2^2\}.$$

Similarly

$$\{3\}\{2^2\} = \{5, 2\} + \{4, 2, 1\} + \{3, 2^2\}$$

and so

$$\{1\}\{3\}\{2^2\} = [\{6, 2\} + \{5, 3\} + \{5, 2, 1\}] + [\{5, 2, 1\} + \{4, 3, 1\} + \{4, 2^2\} + \{4, 2, 1^2\}] + [\{4, 2^2\} + \{3^2, 2\} + \{3, 2, 3\} + \{3, 2^2, 1\}]$$

$$= \{6, 2\} + \{5, 3\} + 2\{5, 2, 1\} + \{4, 3, 1\} + 2\{4, 2^2\} + \{4, 2, 1^2\} + \{3^2, 2\} + \{3, 2^2, 1\}.$$

Hence

$$\{3, 1\}\{2^2\} = \{5, 3\} + \{5, 2, 1\} + \{4, 3, 1\} + \{4, 2^2\} + \{4, 2, 1^2\} + \{3^2, 2\} + \{3, 2^2, 1\}.$$
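The worked example can be retraced step by step. The sketch below assumes the familiar rule that multiplying a simple characteristic by $\{r\}$ adds $r$ boxes to the diagram with no two in the same column, which is what the straightening of disordered symbols leaves after all cancellations. It then forms $\{3, 1\}\{\mu\}$ as $\{1\}\{3\}\{\mu\} - \{4\}\{\mu\}$, exactly as in the text; the function names are illustrative.

```python
from collections import Counter

def add_horizontal_strip(nu, r):
    """Yield every partition obtained from nu by adding r boxes, no two in one column."""
    padded = list(nu) + [0]

    def rec(i, remaining, built):
        if i == len(padded):
            if remaining == 0:
                yield tuple(p for p in built if p > 0)
            return
        # Row i may grow at most to the old length of the row above it (first row: no limit).
        room = remaining if i == 0 else min(remaining, padded[i - 1] - padded[i])
        for extra in range(room + 1):
            yield from rec(i + 1, remaining - extra, built + [padded[i] + extra])

    yield from rec(0, r, [])

def times_row(r, expansion):
    """Multiply a linear combination {partition: coefficient} by the characteristic {r}."""
    out = Counter()
    for nu, coeff in expansion.items():
        for mu in add_horizontal_strip(nu, r):
            out[mu] += coeff
    return out

def product_31_with(mu):
    """{3,1}{mu} computed as {1}{3}{mu} - {4}{mu}, following the worked example."""
    start = Counter({tuple(mu): 1})
    result = times_row(1, times_row(3, start))
    result.subtract(times_row(4, start))
    return {nu: coeff for nu, coeff in result.items() if coeff}
```

Calling `product_31_with((2, 2))` returns the seven terms found above for $\{3, 1\}\{2^2\}$, each with coefficient unity.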


In the following tables the irreducible representations are written across the 
top and the desired direct products are indicated down the left. As examples 
of how the tables are read we cite the following: 


$n + m = 3$; $(2)\cdot(1) = (3) + (2, 1)$
$n + m = 4$; $(2, 1)\cdot(1) = (3, 1) + (2^2) + (2, 1^2)$
$n + m = 5$; $(3)\cdot(1^2) = (4, 1) + (3, 1^2)$
$n + m = 6$; $(2, 1)\cdot(2, 1) = (4, 2) + (4, 1^2) + (3^2) + 2(3, 2, 1) + (2^3) + (3, 1^3) + (2^2, 1^2)$.


Since a change of sign of $(s_2, s_4, \cdots)$ sends $\{\lambda\}$ into the associated characteristic $\{\mu\}$ it is clear that $\{\mu\}\{\mu'\}$ is obtained from $\{\lambda\}\{\lambda'\}$ by merely taking the associated characteristics or representations; e.g. from $\{3\}\{1^2\} = \{4, 1\} + \{3, 1^2\}$ we read $\{1^3\}\{2\} = \{2, 1^3\} + \{3, 1^2\}$. We use this trivially evident fact to materially cut down the size of the tables (without causing trouble to the user) by writing $\{\mu\}\{\mu'\}$ on the right side of the table directly opposite $\{\lambda\}\{\lambda'\}$, where $\{\mu\}$ and $\{\lambda\}$ are associated simple characteristics of the symmetric group on $n$ letters whilst $\{\mu'\}$ and $\{\lambda'\}$ are associated simple characteristics of the symmetric group on $m$ letters; it being understood that when we pick up our direct product on the right-hand side of the table we find the irreducible representations of the symmetric group on $n + m$ letters which occur in the




analysis of the direct product at the bottom of the table; whilst when we pick 
up the direct product on the left-hand side of the table we find the representa- 
tions which occur in its analysis at the top of the table. For convenience of 
printing, the tables for n + m = 8, 9 have been turned so that the top of the 
page is the right-hand side of the table and the left-hand side of the page is 


the top of the table. 


Tables furnishing the analysis of the direct product (λ)·(λ') for all values of n + m from 2 to 9 inclusive.

1. n + m = 2.
(1)·(1) = (2) + (1^2)

2. n + m = 3.
               (3)    (2,1)  (1^3)
(2)·(1)         1       1              (1^2)·(1)
               (1^3)  (2,1)  (3)

3. n + m = 4.
               (4)    (3,1)    (2^2)  (2,1^2)  (1^4)
(3)·(1)         1       1                              (1^3)·(1)
(2,1)·(1)               1        1       1             (2,1)·(1)
(2)·(2)         1       1        1                     (1^2)·(1^2)
(2)·(1^2)               1                1             (1^2)·(2)
               (1^4)  (2,1^2)  (2^2)  (3,1)    (4)

4. n + m = 5.
               (5)    (4,1)    (3,2)    (3,1^2)  (2^2,1)  (2,1^3)  (1^5)
(4)·(1)         1       1                                                  (1^4)·(1)
(3,1)·(1)               1        1         1                               (2,1^2)·(1)
(2^2)·(1)                        1                  1                      (2^2)·(1)
(3)·(2)         1       1        1                                         (1^3)·(1^2)
(2,1)·(2)               1        1         1        1                      (2,1)·(1^2)
(1^3)·(2)                                  1                 1             (3)·(1^2)
               (1^5)  (2,1^3)  (2^2,1)  (3,1^2)  (3,2)    (4,1)    (5)



5. nt+m=6. (6) (5,1) (4,2) (4,12) (3)? (3,2,1) (8,18) (23) (22,12) (2,14) (18) 


(5).(1){ 1 1 | | | | | | | (15). (1 
(3,2).(1) 1 | (23 1). 
(3,17).(1) 1 | | 1 1 (3,12). 
(3,1).(2)) Vote (2,12). 
(22),(2)| | 1 |(22).( 
(2,1?).(2)| | 1 1 | \(3,1). 
(3).(3)} 1 | 1 1 | | 


(18) (2,14) (22,12) (3,18) (28) (3,2,1) (412) (3%) (42) (51) 


(4,2).(1) | |(2?,12).(1) 
(4,12).(1)| 1 1 | 1 | | | | | |(3,13).(1) 
(3,2,1).1)) | 1 | | |(3,2,1).(1) 
(5).(2)| 1 | 1 | 1 }(15).(12) 
(41-2) | | | (2,13).(12) 
(3,2).(2)} | | 1 |(22,1).(12) 
(3,12).(2)} | 1 | 1 | (3,12).(12) 
(22,1).(2)} | | 1 1} 1 1 |(3,2).(12) 
(2,13).(2)} || 1 | 1| 1 1 (4,1).(12) 
(15).(2); | | 1 1 (5). (12) 
(4).(3)) 1} 1] 1 1 | | (14). (13) 
(3,1).(3)| 1 | (2,12). (13) 
(22).(3)) 1} | (22). (13) 
(2,12).(3)) 1 (3,1). (13) 
(14).(3)| | | | | 1 (4).(13) 
(4).(2,1)} | 1 1 | (14). (2,1) 
(3,1).(2,1)| | | (2,12).(2,1) 
(22).(2,1)| 1 (22).(2,1) 
7. n + m = 8.   8. n + m = 9.
[Tables 7 and 8 are illegible in this copy and are not reproduced.]

JOHNS HOPKINS UNIVERSITY,
INSTITUTE FOR ADVANCED STUDY.


REFERENCES. 


1. G. Frobenius, "Über Gruppencharaktere," Berliner Berichte (1896), pp. 985-1021.
2. ———, "Über Relationen zwischen den Charakteren einer Gruppe und denen ihrer Untergruppen," Berliner Berichte (1898), pp. 501-515.
3. ———, "Über die Composition der Charaktere einer Gruppe," Berliner Berichte (1899), pp. 330-339.
4. ———, "Über die Charaktere der symmetrischen Gruppe," Berliner Berichte (1900), pp. 516-534.
5. ———, "Über die charakteristischen Einheiten der symmetrischen Gruppe," Berliner Berichte (1903), pp. 328-358.
6. ———, "Über die Charaktere der mehrfach transitiven Gruppen," Berliner Berichte (1904), pp. 528-571.
7. I. Schur, "Über eine Klasse von Matrizen," Dissertation, Berlin (1901).
8. ———, "Neue Begründung der Theorie der Gruppencharaktere," Berliner Berichte (1905), pp. 406-432.
9. ———, "Über die Darstellung der symmetrischen Gruppe durch lineare homogene Substitutionen," Berliner Berichte (1908), pp. 664-678.
10. ———, "Über die rationalen Darstellungen der allgemeinen linearen Gruppe," Berliner Berichte (1927), pp. 58-75.
11. ———, "Die algebraischen Grundlagen der Darstellungstheorie der Gruppen," Zurich Lectures (1936), pp. 1-74.
12. A. Young, "On quantitative substitutional analysis," Part I, Proceedings of the London Mathematical Society (1), 32 (1901), pp. 384-404; Part II, ibid. (1), 34 (1902), pp. 202-208; Part III, ibid. (2), 28 (1928), pp. 255-292; Parts IV, V, ibid. (2), 31 (1930), pp. 253-272, 273-288; Part VI, ibid. (2), 34 (1932), pp. 196-230; Part VII, ibid. (2), 36 (1934), pp. 304-368; Part VIII, ibid. (2), 37 (1934), pp. 441-495.
13. D. E. Littlewood and A. R. Richardson, "Group characters and algebra," Philosophical Transactions of the Royal Society of London (A), 233 (1934), pp. 99-141.
14. ———, "Immanants of some special matrices," Quarterly Journal of Mathematics, Oxford Ser. 5 (1934), pp. 269-282.
15. D. E. Littlewood, "Group characters and the structure of groups," Proceedings of the London Mathematical Society (2), 39 (1935), pp. 150-199.
16. M. Zia-ud-Din, "The characters of the symmetric group of order 11!," Proceedings of the London Mathematical Society (2), 39 (1935), pp. 200-204.
17. ———, "The characters of the symmetric group of degrees 12 and 13," Proceedings of the London Mathematical Society (2), 42 (1937), pp. 340-355.
18. C. Kostka, "Über den Zusammenhang zwischen einigen Formen von symmetrischen Funktionen," Journal für die reine und ang. Mathematik (Crelle), 93 (1882), pp. 89-123.
19. ———, "Tafeln und Formeln für symmetrische Funktionen," Jahrbuch der Deutschen math. Ver., 16 (1907), pp. 429-451.
20. ———, "Tafeln für symmetrische Funktionen bis zur elften Dimension mit kurzen Erläuterungen," Prog. (5), kgl. Gym. u. Realgymn. Insterburg (1908).
21. W. Specht, "Die irreduziblen Darstellungen der symmetrischen Gruppe," Mathematische Zeitschrift, 39 (1935), pp. 696-711.
22. E. Wigner, "On the consequences of the symmetry of the nuclear Hamiltonian on the spectroscopy of nuclei," Phys. Rev., 2nd Ser. 51 (1937), pp. 106-119.
23. ———, "On the structure of nuclei beyond oxygen," Phys. Rev., 2nd Ser. 51 (1937), pp. 947-958.



THE HEAVISIDE OPERATIONAL CALCULUS.* 


By D. G. BOURGIN and R. J. DUFFIN.


In its primary form the Heaviside calculus is concerned with the interpretation and application of functions of the operator $p$ where $p$ takes $x^n$ into $nx^{n-1}$. Various representations are,¹ of course, possible. Heaviside's developments as well as the closely related work of Volterra² on permutable functions of the closed cycle, depend on series expansion of $F(p)$ and term by term
interpretation according to the association p’ = «’1/T(v). The particular 
representation used in this paper is that of the Laplace-Mellin integrals, 
namely, F(p) = f(x) * stands for 


(1.1) F(p) (x) dx 
f(x+0) +f(e@—0) 4 
9 


1 


This paper may be considered a study of some special results in the theory 
of these integrals. 

The specific concerns of this work include the validation of the asymptotic 
expansion theorem of Heaviside for a wide class of functions and two theorems 


* This joint work was completed in all essentials while the junior author, R. J. 
Duffin, was in the physics department of the University of Illinois. Received by the 
Editors January 9, 1936; Revised October 5, 1936, and March 6, 1937. 

¹ H. Jeffreys, “Operational methods,” Cambridge Tracts; T. C. Fry, Annals of
Mathematics, vol. 34 (1921), p. 184; N. Wiener, Mathematische Annalen, vol. 95 (1926),
p. 95; P. Levy, Bulletin Mathematique de France, vol. 1 (1926), p. 174.

² Volterra and Pérès, Leçons sur la Composition.

³ The notation is due to B. van der Pol, Philosophical Magazine, vol. 8 (1929), p. 801.

⁴ In this article wherever $f(x)$ stands alone on one side of an equation, the meaning
$(f(x+0) + f(x-0))/2$ is to be ascribed to it.

⁵ Since there is no finite natural boundary for $F(p)$ in the present work, it is
tacitly assumed that analytic continuation is used in the cut plane. For operational
application such continuation is usually carried out by the principle of “permanence
of form.”

Operational interpretations may, of course, be developed for specific function classes
by introducing simple closed circuits or Hankel or Pochhammer contours instead of the
ordinate $\Re(p) = c$. Such interpretations, in the writers' opinion, are not strictly
speaking of “Heaviside” type, in general, since the property of vanishing of the
functions, thus defined, for negative real values is given up. Moreover the intimate
relationship with the Fourier integral, stressed in this paper, is lost.




which may be used to establish many of the formal identities in the literature 
of the Heaviside theory as well as certain extensions of that discipline for 
instance to a conjugate Heaviside calculus. The most interesting contributions 
are those connected with the development of certain reciprocal kernel relation- 
ships and the solutions of the Laplace integral equation. 

Closely allied to the study of Eqs. 1.1, 1.2 is, in a sense, that of Fourier
transforms; one writes $e^{-cx}f(x)$ in place of the usual $f(x)$ and $F(c+it)$ instead
of $F(p)$, viz.

(1.11)  $F(c+it) = \int_{-\infty}^{\infty} e^{-itx}\,e^{-cx} f(x)\,dx$,

(1.21)  $e^{-cx} f(x) = \dfrac{1}{2\pi}\int_{-\infty}^{\infty} F(c+it)\,e^{itx}\,dt$.

These are precisely the integrals of Eqs. 1.1 and 1.2.

In general, Fourier integral theorems imply results for the Mellin transformation.
Evidently, then, conditions such as the bounded variation of $f(x)$
in the neighborhood of a point and the absolute integrability of $e^{-cx}f(x)$ over
the axis of reals⁶ are sufficient for the inversion formulae, Eq. 1.11 and Eq. 1.21,
provided that the Cauchy principal value⁷ $\lim_{A\to\infty}\int_{-A}^{A}$ is understood
in evaluating the infinite integrals.

In order to extend the set of operators, f(x) in Eqs. 1.1 and 1.2 is 
assumed to vanish on the negative real axis and all “ permissible ” operators 
are such as to leave this function class invariant, (i.e. the lower limit of the 
integral of Eq. 1.1 may always be taken as 0). This rules out, for instance, 


⁶ E. W. Hobson, Functions of a Real Variable, Cambridge Press, 2nd edition, vol. 2,
p. 721. Throughout this paper the bounded variation condition introduced to guarantee
the limit may be generally replaced by any other of the Fourier integral or Fourier
series conditions for convergence at a point.

⁷ Some such condition is essential, for the integral (with real $f(x)$)

$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{ip(x-t)} f(x)\,dx\,dp$

may be written

$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x)\cos p(x-t)\,dx\,dp + i\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x)\sin p(x-t)\,dx\,dp.$

The second integral on the right requires much stronger conditions than does the first
for convergence. P. Pi Calleja, Mathematische Zeitschrift, vol. 40 (1935), p. 349.
However, the integrand is easily seen to be an odd function in $p$ so that the Cauchy limit
on $p$ exists and is 0 for functions satisfying the condition for existence of the Fourier
cosine integral.




the operator $e^{hp}$, for $e^{hp}f(x)$ is $f(x+h)$ formally, which is non-zero in general
for $-h < x < 0$ $(h > 0)$. However, $e^{-hp}$, $h > 0$, is a permissible operator.
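The permissible shift operator can be illustrated numerically. The sketch below (an illustration added here, not taken from the paper) checks that the Laplace integral of the delayed function $f(x-h)$, vanishing for $x < h$, equals $e^{-hp}F(p)$. The test function $f(x) = xe^{-x}$, with $F(p) = 1/(p+1)^2$, and the values of $h$ and $p$ are assumptions chosen for convenience.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def f(x):
    """Sample function of the Heaviside class: zero on the negative real axis."""
    return x * math.exp(-x) if x >= 0.0 else 0.0

shift, p = 0.75, 2.0                   # illustrative values with shift > 0
# Laplace integral of the delayed function f(x - shift); it vanishes for x < shift.
delayed = simpson(lambda x: math.exp(-p * x) * f(x - shift), shift, shift + 60.0)
# exp(-shift * p) applied to F(p) = 1 / (p + 1)^2
expected = math.exp(-shift * p) / (p + 1.0) ** 2
print(delayed, expected)
```

The delayed function still vanishes on the negative axis, which is why $e^{-hp}$ with $h > 0$ leaves the function class invariant while $e^{hp}$ does not.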
We proceed now to establish, in a direct fashion, an asymptotic expansion
theorem of considerably greater content and precision than Heaviside's “rules.”

THEOREM 1. (a) $F(p)$ is analytic except for poles of order $\mu_1, \cdots, \mu_n$
at $p_1, \cdots, p_n$; $F(p)$ has essential singularities at $e_1, \cdots, e_t$. In each
sufficiently small deleted neighborhood $F(p)$ is analytic and expansible in a Laurent
series $\sum_{n=0}^{\infty} A_{kn}(p-e_k)^n + \sum_{n=0}^{\infty} B_{kn}(p-e_k)^{-n-1}$.
$F(p)$ has branch points of finite order at $b_1, \cdots, b_m$ in the neighborhoods of which

$F(p) = (p-b_i)^{-\alpha_i}\,\psi_i(p-b_i) + \varphi_i(p-b_i)$,

the power series

$\psi_i(p-b_i) = \sum a_{in}(p-b_i)^n$,  $\varphi_i(p-b_i) = \sum c_{in}(p-b_i)^n$

converge for $|p-b_i| \le r_i$, $r_i > 0$. (b) $|F(p)| \to 0$ uniformly for
$\pi/2 \le \arg(p-c) \le 3\pi/2$. If $b$ is the abscissa of the singularity furthest right
the ordering is such that $\Re(p_j) = \Re(b_i) = \Re(e_k) = b$ for the first $l$ values
of $j$, the first $q$ values of $i$, and the first $f$ values of $k$.

Under these hypotheses on $F(p)$

(2)  $F(p) \doteq f(x) \sim \sum_{j=1}^{l}\sum_{k=1}^{\mu_j} \operatorname{Res}\,[F(p)(p-p_j)^{k-1}]\; e^{p_j x}\,x^{k-1}/\Gamma(k)$
     $+ \sum_{i=1}^{q} e^{b_i x}\sum_{n=0}^{\infty} a_{in}\,x^{\alpha_i-n-1}/\Gamma(\alpha_i-n) + \sum_{k=1}^{f} e^{e_k x}\sum_{n=0}^{\infty} B_{kn}\,x^{n}/\Gamma(n+1)$.

For $x > 0$ we use the closed contour made up of the ordinate $\Re(p) = c > b$,
the left hand infinite semi-circle on this ordinate together with the necessary
non-intersecting branch cuts and small circles about the poles. By Cauchy's
Theorem

(2.1)  $\int_{c-i\infty}^{c+i\infty} e^{px}F(p)\,dp - \sum$ (residues + integrals about essential
singularities + integrals around branch cuts + integrals on the semi-circle) $= 0$.


The residues evidently contribute just the terms involved in the first 
summation in the theorem. 

Surround the essential singularities by non-intersecting circles of radii $s_k$
lying within the regions of convergence of the Laurent series. Because of the
uniform convergence of these series on the circles term by term integration is
justified. The inverse powers alone contribute to $f(x)$ and their effect is
epitomized in the third group of terms of Eq. 2. Evidently

$\lim_{n\to\infty} |B_{kn}|^{1/n} = O(s_k)$.

The function $\sum_{0}^{\infty} B_{kn}x^{n}/\Gamma(n+1)$ is then an entire function of minimal type,
and thus is dominated, for large $x$, by $Ae^{\epsilon x}$, $\epsilon$ arbitrarily small $> 0$. The
minimal property indicates that the contributions of $e_{f+w}$, $(w = 1, \cdots, t-f)$,
are to be compared according to the value of the exponentials $e^{\Re(e_k)x}$. For
$k = f + w$ these are negligible for large $x$, and accordingly these terms as well
as those arising from $p_{l+1}, \cdots, p_n$ are unimportant.

Consider now the third term under the summation sign in Eq. 2.1. The
branch cut for $b_i$ may be taken as a straight line inclined at an angle $\theta$ with
the real axis, $3\pi/2 > \theta > \pi/2$. For ease of exposition alone, $\theta$ will be taken
as $\pi$ in the work below.⁸ The contour around the cut may be taken as made
up of the part of the upper and lower edges of the cut terminating to the left
of $b_i$ within the circle of radius $r_i$ and a loop, denoted hereafter by $C_i$, around
the branch point and lying entirely within this circle to complete the cycle.
The transformation $z = p - b_i$ brings the branch point $b_i$ to the origin and
introduces the factor $e^{b_i x}$. On writing $-\Re(z) > \rho_i$ with
$\arg z = \pi$ or $-\pi$ for the upper and lower boundaries of the cut, the absolute
value of the contribution due to these parts of the dissected contour may be


exhibited as 


@) 
(2. 2) | 1 ef F(ret™ + b,) — F(re-*™ + ]dr | S 
p 


since $F(p)$ is bounded on the cut away from the branch point by $2\pi K_1$. This
is later shown to be negligible in comparison with the other terms in the final
developments so that the behavior of $F(p)$ in the immediate neighborhood
of the branch points determines the result.

For the term $a_{in}z^{n-\alpha_i}$ the loop integral around the branch point may be
written, on making the substitution $zx = -u$, as

(2.3)  $-\dfrac{a_{in}\,e^{b_i x}\,x^{\alpha_i-n-1}}{2\pi i}\int_{C_i'} (-u)^{n-\alpha_i}\,e^{-u}\,du$

where $C_i'$ is the Hankel loop starting from the point on the upper edge of the
cut (in the $u$ plane) of abscissa $\rho_i x$ and passing counter clockwise about the


⁸ Indentations to avoid possible singularities are tacitly neglected since their sole
effect is at most a change in $K_1$ of Equation 2.2.




origin to the point just below on the lower edge. For $x$ large enough this
approaches⁹

(2.31)  $\dfrac{a_{in}\,e^{b_i x}}{\pi}\,x^{\alpha_i-n-1}\,\sin(\alpha_i\pi)\,\Gamma(n+1-\alpha_i)\,(-1)^n$.

For identification as a term in the second expansion in Eq. 2 one need remark
here and later that

$\dfrac{\sin(\alpha_i\pi)}{\pi}(-1)^n = \dfrac{\sin((\alpha_i-n)\pi)}{\pi} = [\Gamma(1-(\alpha_i-n))\,\Gamma(\alpha_i-n)]^{-1}$.

The maximum difference between Eq. 2.31 and Eq. 2.3 is found, essentially,
by bounding $\int_{\rho_i x}^{\infty} e^{-r}\,r^{n-\alpha_i}\,dr$. For sufficiently large $x$ this difference is then


easily shown to be inferior to 
(2. 32) Kote 


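We wish to show now that the series defined by the sum of terms of
Eq. 2.31, exclusive of the $e^{b_i x}$ factor, is an asymptotic series, namely that

(2.4)  $\lim_{x\to\infty} x^{M+1-\alpha_i}\,\Big|\,\dfrac{1}{2\pi i}\int_{C_i} e^{zx}\,\psi_i(z)\,z^{-\alpha_i}\,dz - \sum_{n=0}^{M}\dfrac{a_{in}}{\pi}\,x^{\alpha_i-n-1}\sin(\alpha_i\pi)\,\Gamma(n+1-\alpha_i)(-1)^n\,\Big| = 0$

where $\psi_i(z)z^{-\alpha_i}$ is written instead of $F(z+b_i)$ since evidently the loop
integral for $\varphi_i(z)$ is 0.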

The expression under the absolute value signs in Eq. 2.4 is surely
inferior to


j | 


. 
(2. | (z)dz | + Korte + Kye 
| 


where $w_i(z)z^{M+1}$ is the remainder after subtracting off the first $M+1$ terms
of the series expansion of $\psi_i(z)$. Our immediate problem is to show that
Eq. 2.5 is at most $o(x^{\alpha_i-M-1})$ for $x \to \infty$. Hence the last terms in that
expression are unimportant.

Evidently $w_i(z)$ is analytic for $|z| < \rho_i$ and so $|w_i(z)| < K_4$. We may
deform the loop into a circle $|z| = \sigma_i/x$, $\sigma_i < \rho_i$, and the upper and lower
edges of the cut in the range $-\rho_i \le \Re(z) \le -\sigma_i/x$.

The contribution from the integrals along the cut is less than


⁹ Whittaker and Watson, Modern Analysis, 3rd edition, p. 244.


sin 


| 


K; | ety dp 


= K, 


Similarly the integral around the small circle is inferior to 
(2. 61) K 


The dominants found in Eqs. 2.6 and 2.61 guarantee that Eq. 2.5 is at
most $o(x^{\alpha_i-M-1})$ for $x \to \infty$. The asymptotic character of the expansions
considered has thus been established.

According to a simple extension of a classical lemma due to Jordan the
integral on the infinite semi-circle vanishes when account is taken of (b).
Furthermore, since the dominants given in Eqs. 2.2 and 2.32 involve an
exponential decrease faster than that of terms with an $e^{bx}$ factor, it follows
that the value of the Mellin integral in Eq. 2.1 is asymptotically approximated
by the expansions in the statement of the theorem. The assurance that the
Mellin integral really exists follows on the observation that the terms under
the summation sign in Eq. 2.1 remain finite for a divergent sequence of
sufficiently large semi-circles erected on $\Re(p) = c$ and that Eq. 2.1 is valid
for each of the resulting closed contours.

The proof is complete.¹⁰
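The branch-point part of the expansion can be tried on a standard example that is not worked in the paper: the classical pair $F(p) = (p^2+1)^{-1/2} \doteq J_0(x)$. Here the branch points are $b = \pm i$ with $\alpha = 1/2$ and leading coefficients $a_0 = (\pm 2i)^{-1/2}$, so the leading term $e^{bx}a_0x^{\alpha-1}/\Gamma(\alpha)$ summed over the conjugate pair should reproduce the familiar $\sqrt{2/\pi x}\,\cos(x - \pi/4)$. The sketch below, with the hypothetical helper names `bessel_j0` and `leading_term`, compares that term with $J_0(x)$ computed from Bessel's integral.

```python
import math

def bessel_j0(x, n=4000):
    """J0(x) from Bessel's integral (1/pi) * int_0^pi cos(x sin t) dt, midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) * h / math.pi

def leading_term(x, alpha=0.5):
    """Leading branch-point term for F(p) = (p^2 + 1)^(-1/2), branch points at +i and -i."""
    a0 = (2j) ** (-alpha)                         # value of (p + i)^(-1/2) at p = i
    term = a0 * complex(math.cos(x), math.sin(x)) * x ** (alpha - 1) / math.gamma(alpha)
    return 2.0 * term.real                        # the point -i contributes the conjugate

x = 50.0
exact_value = bessel_j0(x)
approx_value = leading_term(x)
print(exact_value, approx_value)
```

The discrepancy is of the order of the first neglected term, which is smaller than the leading one by a factor comparable to $1/x$.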

If $F(p)$ is restricted to correspond to a real function $f(x)$ (of the real
variable $x$) then $a_{in}$ and $B_{kn}$ are real and each exponential of imaginary
argument is replaced by a sine or cosine. This follows immediately on
remarking that then

$\Re F(\bar p) = \Re F(p)$;  $\Im F(\bar p) = -\Im F(p)$

which imply that $b_i$, $e_k$ and $p_j$ occur in conjugate pairs.

Volterra composition, namely

$F_1(p)F_2(p) \doteq \int_0^x f_1(x-\xi)\,f_2(\xi)\,d\xi$


¹⁰ This clears up the doubts arising in the minds of some of the Heaviside followers,
for instance those of Carson, concerning the validity and applicability of the asymptotic
Heaviside expansions. Cf. Carson, Electrical Circuits, p. 84. Carson's difficulties are
but imperfectly answered in Levy's paper, ibid., and the main point is not touched.
The situation is, of course, summarized in Equation 2. March, Bulletin of the
Mathematical Society, vol. 33 (1927), p. 311, has utilized a similar method for a rather
restricted case. He does not, moreover, give any exact sufficient conditions for validity
and omits the demonstration of the asymptotic property of the development.




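follows under suitable restrictions from Eqs. 1.1 and 1.2 when it is recalled
that $f_1(x)$ and $f_2(x)$ vanish for negative values of their arguments. Here
we are interested in the (inverse) composition on the $p$ functions, namely

(3)  $f_1(x)f_2(x) \doteq F_3(p) = \dfrac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} F_1(z)\,F_2(p-z)\,dz$.

THEOREM 2. If $f_1(x)e^{-c_1x}$, $f_2(x)e^{-c_2x}$ belong to $L_2(0,\infty)$ and

(b)  $\Re(p) = d > c_1 + c_2$,  $c > c_1$,  $d - c > c_2$,

then¹¹ Eq. 3 is valid and $F_3(p)$ is analytic for $\Re(p)$ satisfying (b), is at worst
$o(1)$ for $|q| \to \infty$, $q = \Im(p)$, and the Mellin integral with $F_3(p)$ is summable
(C,1) to $f_1(x)f_2(x)$⁴ wherever this has meaning.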


The proof is immediate, for if

(3.1)  $F_1(\sigma+it) = \int_0^\infty f_1(x)\,e^{-(\sigma+it)x}\,dx$ with $\sigma > c_1$,

it is well known¹² that $F_1(\sigma+it)$ is not only analytic in the half plane $\sigma > c_1$
but that

(3.2)  $\int_{-\infty}^{\infty} |F_1(\sigma+it)|^2\,dt < \infty$.

This guarantees that $F_1(c+it)$ and $F_2(d-(c+it))$ are Fourier transforms
of class $L_2$ on the ordinates provided $c > c_1$, $d-c > c_2$. It is easy to
see that the analogue of the Parseval identity becomes¹³


A 
Lance P,(¢ + it) + ig — (¢ + it) dt 
-A 
c+iA B 
c-iA 


== 


The first integrand evidently belongs to $L_1$ by the Schwarz Inequality. The
last integral may be written $\int_0^\infty g(x)\,e^{-(h+iq)x}\,dx$, $g(x) \subset L_1(0,\infty)$, where
$h = d - c_1 - c_2 > 0$ and $g(x)$ is the product of two functions each of which
belongs to $L_2(0,\infty)$:

(3.4)  $g(x) = (f_1(x)e^{-c_1x})\,(f_2(x)e^{-c_2x})$.


¹¹ Somewhat similar theorems, involving different conditions, in connection with
Dirichlet Series, occur in the recent literature. Cf. D. V. Widder, American Journal
of Mathematics, vol. 49 (1927), p. 321, for the case $f_1(x)$ absolutely integrable; V.
Bernstein, Series Dirichlet, Appendix I for the case $f_1(x)$ analytic in sectors.

¹² Paley and Wiener, “Fourier transforms,” American Colloquium Publications
(hereafter designated P. W.) Theorem 5.

¹³ For instance by paralleling the steps in N. Wiener, Acta Mathematica, vol. 118
(1930), p. 55.


Accordingly, the integrals in Eq. 3.3 not only exist but, by Eq. 3.4, define an
analytic function¹⁴ $F_3(p)$ in the half plane determined by (b). In the event
that the first integrand of Eq. 3.3 or $g(x)$ also belongs to $L_2$ then, of course,
$F_3(p)$ belongs to $L_2$ as well.

The deduction $|F_3(p)| = o(1)$, $\Re(p) = d$, is correct whenever $f(x)$
(here $g(x)$) of Eq. 1.1 belongs to $L_1(0,\infty)$, viz:


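$\Big|\int_0^\infty g(x)e^{-(h+iq)x}\,dx\Big| \le \Big|\int_0^{A} g(x)e^{-(h+iq)x}\,dx\Big| + \Big|\int_A^{1/A} g(x)e^{-(h+iq)x}\,dx\Big| + \Big|\int_{1/A}^{\infty} g(x)e^{-(h+iq)x}\,dx\Big|.$

For $A$ sufficiently small the moduli of the first and last integrals on the right
are inferior to $\epsilon$ uniformly in $q$. The second integral goes to 0, by the
Riemann-Lebesgue lemma, for $|q| \to \infty$.

The remarks in the introduction make it clear that (C,1) summability follows
by direct extension from the known Fourier integral result since $g(x) \subset L_1$.

On making use of the relation $e^{-hp}F(p) \doteq f(x-h)$ we may write Eq. 3
in a form convenient for many applications.

(3.01)  $\dfrac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} F_1(p-z)\,F_2(z)\,dz = \int_0^\infty e^{-px} f_1(x)\,f_2(x)\,dx$.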


The analyticity of $F_3(p)$ in the right half plane indicates that the singularities
of $F_1(p-z)$ lie to the right of $\Re(z) = c$ and those of $F_2(z)$ to the
left. In the special case of polar singularities and $F_2(z)F_1(p-z) = o(z^{-1})$
uniformly in $\arg(z-c)$ on either the right or left infinite semi-circle
constructed on the diameter $\Re(z) = c$, for instance, it is possible to close the
contour at infinity and to contract in such wise as not to pass any singularities
included in the interior. Thus the closed contour contains all the polar
singularities of just one of the functions involved.¹⁵
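A small numerical sketch of the inverse composition may help fix ideas; it is an illustration under assumed data, not the authors' computation. With $f_1(x) = e^{-x}$ and $f_2(x) = e^{-2x}$ one has $F_1(p) = 1/(p+1)$, $F_2(p) = 1/(p+2)$, and the product $f_1f_2 = e^{-3x}$ should correspond to $1/(p+3)$. The ordinate $\Re(z) = c$ is mapped to a finite interval by the substitution $z = c + i\tan\theta$, which is a convenience of this sketch; the function name `inverse_composition` is hypothetical.

```python
import math

def inverse_composition(F1, F2, p, c, n=20000):
    """(1 / 2 pi i) * integral over Re z = c of F1(p - z) F2(z) dz.

    Uses z = c + i tan(theta) and the midpoint rule on (-pi/2, pi/2).
    """
    h = math.pi / n
    total = 0.0 + 0.0j
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h
        t = math.tan(theta)
        z = complex(c, t)
        total += F1(p - z) * F2(z) * (1.0 + t * t)   # dt = sec^2(theta) d(theta)
    return total * h / (2.0 * math.pi)               # dz = i dt cancels the i of 1/(2 pi i)

F1 = lambda s: 1.0 / (s + 1.0)   # transform of f1(x) = exp(-x)
F2 = lambda s: 1.0 / (s + 2.0)   # transform of f2(x) = exp(-2x)

p, c = 1.5, 0.5                  # the pole of F2 lies left of Re z = c, that of F1(p - z) right
F3 = inverse_composition(F1, F2, p, c)
expected = 1.0 / (p + 3.0)       # transform of f1(x) f2(x) = exp(-3x)
print(F3, expected)
```

Closing the contour on the left picks up only the pole of $F_2(z)$ at $z = -2$, which is the contour argument of the preceding paragraph in its simplest instance.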


¹⁴ S. Bochner, Vorlesungen über Four. Int., Leipzig, p. 145.

¹⁵ This special case comprehends the usual Heaviside rules. $f_2(x) = 1$, $e^{-ax}$, $x^n$
($n$ integral) leads to the Cauchy formula, to $F_1(p+a)$ and $(-d/dp)^n$ respectively.
For non-integral $n$ in the last example Theorem 2 provides the basis for a theory of
fractional differentiation and integration of $p$ functions comparable to that known for
the $x$ functions. In this connection compare Equation 5.1.




The following simple identity may be made the basis for many novel 
developments as well as for a number of formal results already in the Heaviside 
literature. 


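(4)¹⁶  $\int_0^\infty e^{-p_1x} f_1(x)\,F_2(p_2+x)\,dx = \int_0^\infty e^{-p_2x} f_2(x)\,F_1(p_1+x)\,dx$.

It is convenient to use $\varphi_i(x) = e^{-p_ix}f_i(x)$. Eq. 4 is then obviously valid when

$\int_0^\infty \varphi_1(x)\,dx\int_0^\infty \varphi_2(\lambda)\,e^{-\lambda x}\,d\lambda = \int_0^\infty \varphi_2(\lambda)\,d\lambda\int_0^\infty \varphi_1(x)\,e^{-\lambda x}\,dx.$

The indicated inversion is justified for hypotheses such as¹⁷

THEOREM 3. $\varphi_i(x)$ is integrable $L_1$ over any finite closed range of positive
$x$ values not including 0 or $\infty$ and either side of Eq. 4 exists for absolute
values of the integrand.

A simple extension of Titchmarsh's theorem¹⁸ 2.62 suffices.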
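The identity of Eq. 4 lends itself to a direct numerical check. The following sketch is illustrative only: the choices $f_1(x) = x$ with $F_1(p) = 1/p^2$, $f_2(x) = e^{-x}$ with $F_2(p) = 1/(p+1)$, and $p_1 = 1$, $p_2 = 2$ are assumptions made for the example, and the integrals are truncated where the integrands are negligible.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

f1 = lambda x: x                      # F1(p) = 1 / p^2
F1 = lambda p: 1.0 / p ** 2
f2 = lambda x: math.exp(-x)           # F2(p) = 1 / (p + 1)
F2 = lambda p: 1.0 / (p + 1.0)

p1, p2 = 1.0, 2.0
lhs = simpson(lambda x: math.exp(-p1 * x) * f1(x) * F2(p2 + x), 0.0, 60.0)
rhs = simpson(lambda x: math.exp(-p2 * x) * f2(x) * F1(p1 + x), 0.0, 60.0)
print(lhs, rhs)
```

Both sides are the same double integral of $\varphi_1(x)\varphi_2(\lambda)e^{-\lambda x}$ taken in the two possible orders, which is why the theorems that follow are statements about inverting the order of integration.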


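THEOREM 3A. $\int_0^A |\varphi_1(\lambda)|\,d\lambda$, $\int_0^A |\varphi_2(\lambda)|\,d\lambda$, $0 < A < \infty$, and $\int_0^\infty \varphi_i(\lambda)\,d\lambda$
($i = 1, 2$), exist as Riemann integrals.¹⁹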


¹⁶ $f_i(x)$ and $p_i$ are treated as real for the proofs below. However, the results are
valid generally, on splitting up the integrand into the four combinations of the real
and imaginary parts of $\varphi_1(x)$ with those of $\varphi_2(\lambda)$,
if the hypotheses are satisfied for the separate products. The normal case in operational
theory involves only $p_i$ complex in which case Theorem 3A alone is affected.

¹⁷ The van der Pol result $\int_0^\infty F(p)\,dp = \int_0^\infty f(x)/x\,dx$ is the special case for
which one of the functions is 1 and $p_1 = p_2 = 0$. This identity has been of
extraordinary utility in the work of B. van der Pol, loc. cit. and later Philosophical Magazine
papers. The Riemann-Stieltjes equation


arises on taking one of the $\varphi$'s as $e^{-hy}$.

¹⁸ E. C. Titchmarsh, Theory of Functions, Oxford Press. (Hereafter designated T.).


1°This theorem admits cases such as ¢,(a#) = ezeetsinece*, For this function 
$\int_0^\infty e^{-xp}\varphi_1(x)\,dx$ converges conditionally, only, for $p = 0$. Eq. 1.2 with $F_1(p)$ requires
generally a summability interpretation. For summability (C,1) the validation follows
from a result of Hardy's, G. H. Hardy, Messenger of Mathematics, vol. 47 (1917), p. 178.
Since the central inequality, Eq. 4.2, does not require uniform convergence, it is evident
that the hypotheses of Theorem 3 and 3A may be further considerably weakened.


The proof is straightforward. We assert that 


A xX 


Under our hypotheses both iterated integrals exist for absolute values of the
integrand. Hence the integrations may be interpreted in the sense of Lebesgue,
but then the integration order is immaterial²⁰ and this result must hold also
for the Riemann interpretation.²¹

After a preliminary integration by parts, it is easy to demonstrate the uniform
convergence of $\int_0^\infty \varphi_2(\lambda)\,e^{-\lambda x}\,d\lambda$ for $x \ge 0$. Accordingly


for (a) $A$ fixed $= A_0$ and all $X$, (b) $X$ fixed and all $A \ge A_0$.
The absolute integrability of ¢2(A) over finite ranges justifies the assertion 
that for fixed A, X, exists such that 


X 


Eqs. 4.1, 4.2 and 4.3 are sufficient to establish the validity of the change
in integration order²² involved in Eq. 4. An obvious generalization to cover
the case that $\int_l^A |\varphi_2(\lambda)|\,d\lambda$ exists for $l > 0$, $A < \infty$ only is included by
interchanging the rôles of 0 and $\infty$ in the above proof.

THEOREM 3B. $\varphi_i(x)$ belongs to $L_k(0,\infty)$, $1 < k \le 2$.


The first half of the conditions of Theorem 3 are easily shown to be
satisfied, for integrability $L_k$ implies integrability $L_1$ over finite ranges. We
show now that the last condition of Theorem 3 is also met.

If $\psi_i(x) = \int_0^\infty e^{-x\lambda}\,|\varphi_i(\lambda)|\,d\lambda$ it is known²³ that

$\int_0^\infty |\psi_i(x)|^{k'}\,dx \le c\,\Big(\int_0^\infty |\varphi_i(\lambda)|^{k}\,d\lambda\Big)^{k'/k}$,  $c < \infty$,


²⁰ T., Theorem 12.6.

²¹ T., p. 340.

²² W. H. Young, Cambridge Philosophical Transactions, vol. 21 (1910), p. 48.

²³ G. H. Hardy, Journal of the London Mathematical Society, vol. 8 (1933), p. 114.


A 
<af | ne for all X = X,,. 




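and $1/k + 1/k' = 1$. It may easily be shown that $\psi_i(x)$ is continuous for
$x > 0$, hence $\psi_i(x)$ is measurable and therefore $\psi_i(x)$ belongs to $L_{k'}(0,\infty)$.
By Hölder's inequality

$\int_0^\infty |\varphi_1(x)|\,\psi_2(x)\,dx \le \Big(\int_0^\infty |\varphi_1(x)|^{k}\,dx\Big)^{1/k}\Big(\int_0^\infty \psi_2(x)^{k'}\,dx\Big)^{1/k'} < \infty.$

Thus the sufficient hypotheses of Theorem 3 are implied by those of
Theorem 3B.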

As a first direct application of Theorems 2 and 3, we present an
independently interesting development of a calculus, conjugate, in a certain
sense, to that of Heaviside. Negative integral powers of the variable are not
comprehended²⁴ by our representation (Eq. 1.1 and Eq. 1.2) of the Heaviside
calculus. However, an operational theory can be stated for such functions on
interchanging the meaning of operator and variable in Eqs. 1.1 and 1.2 so
that $q$ and $x$ correspond to the previous $x$, $p$ respectively. The new variable
($x$) now ranges over the entire complex plane. An operational expression may
be interpreted simultaneously according to both the Heaviside and the
conjugate symbolic theories on decomposing the operand


f(z) + ¥(q) F (2) 


where $g(x) = f(x) + F(x)$ and the notation for the functions indicates the
sets to which $f(x)$ and $F(x)$ belong. This decomposition is no longer unique
when fractional powers are present, for a distinction regarding domain of
consideration must be made for $x^{\nu}$, $0 < \nu < 1$, for instance, accordingly as it is
included in the Heaviside set or the set of the conjugate theory functions.

The following algorithms are straightforward consequences of Theorem 3
for functions fulfilling the restrictions stated there.


1 
gF(2) = (@)F(@) 


(a) F(a) =(f wna) = fret F(t) dt. 


(From the viewpoint of Volterra composition there is a formal analogy here 
to the usual p“f(x) interpretation 


ie. = —Ha (= 


²⁴ Formally $p \log p \doteq -x^{-1}$, but this is not in the domain of Equation 1.1.


with h(y) the unit function vanishing on the negative axis of reals. For the 
special case of the conjugate theory cf. (a) when z is real, the same com- 
position formula may be considered to apply, but h(y) now represents the unit 
function for negative real values. The composition property with the reflected 
unit function holds throughout the conjugate theory when z is real. Vide (b) 
and (c) below. 


(6) = (fe onan) 
+ 2)/T(n)da 


(5.1) 


The last result may be obtained operationally (for n a positive integer) by 

formally differentiating (2) with respect to —c. With c=0 

Eq. 5.1 may be interpreted as a fractional integral of a p function. 
Theorem 3 determines the function associated with $f(x^n)$ at least for $n$
integral. For $n = 2$ we start with
fi(x) =exp(— A*/r)/4 = Fy (p) = exp(— Ap*®) 
whence quite directly 


(x) =f F.,(2?)exp(— p*) /2?dz. 


0 
The application to constant coefficient differential equations is direct 
and may be briefly summarized. Consider the differential equation with 


constant coefficients 


= L(d/dr)y = L(—q)y = F(z) ; 
(5. 2) lim y,- +, y"2—0, R(x) Sc. 


We may write, on the assumption of no positive real zeros, 


4) =f Where f(A) 


In accordance with Theorem 3 (or Eq. 5.1) this may be expressed as 


(5. 4) y(z) = +A)da 


where $A(\lambda)$ is the ordinary “indicial” function of the Heaviside theory
corresponding to $L(-\lambda)$. It is well known in fact that




1 d 


if the zeros are distinct and not positive real. 
However, another viewpoint may be used in connection with Eq. 5. 3. 
Define 


then for the case F(x) = a" Kq. 5.3 reduces to 
d 
y(x) = B(x) = (ajar) / q)|a, 


for distinct not positive real zeros of $L(-p)$.

This is the analogue of the Heaviside-Carson “indicial” function's
derivative. For the equation with a general $F(x)$ the inverse composition
process of Theorem 2 gives, as an alternative to the formula of Eq. 5.4, the

solution 


(5.41) y(2) BOP@— bat 


The more usual formulation of the Mellin integrals is connected with the
operator $s = x\,d/dx$ and is expressed

(1.1 bis)  $G(s) = \int_0^\infty v^{s-1}\,g(v)\,dv$,

(1.2 bis)  $g(v) = \dfrac{1}{2\pi i}\int_{d-i\infty}^{d+i\infty} v^{-s}\,G(s)\,ds$.


Combination of the operators s and p may well be expected to have special 
interest in operational theory. Consider then 


Some striking inversion relations arise through the intermediation of 
Eq. 6. We write 


The nature of the reciprocal formula is indicated by the following purely 


formal developments 
Ag oo 

(6. 11) 5 v816(v) F(v) dv 
Ay 0 0 


if Theorem 3 applies, where 




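$F(v) = \int_0^\infty e^{-vx} f(x)\,dx.$

Then

$\varphi(v)F(v) = \dfrac{1}{2\pi i}\int_{d-i\infty}^{d+i\infty} v^{-s}\,\psi(s)\,ds$

and

$f(x) = \Big(\dfrac{1}{2\pi i}\Big)^{2}\int_{c-i\infty}^{c+i\infty}\int_{d-i\infty}^{d+i\infty} e^{vx}\,v^{-s}\,\psi(s)/\varphi(v)\,ds\,dv$

(6.2)  $= \dfrac{1}{2\pi i}\int_{d-i\infty}^{d+i\infty} K(s,x)\,\psi(s)\,ds$

with

$K(s,x) = \dfrac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} \dfrac{e^{vx}\,v^{-s}}{\varphi(v)}\,dv.$

The rigorous validation of this mode of derivation of the important Eq.
6.2 presents difficulties because the integrals involved are generally not
absolutely convergent.²⁶ For the case $\varphi(v) = (1-e^{-v})^{-1}$,
$\psi(s,v) = \Gamma(s)\,\zeta(s,v)$, where $\zeta(s,v) = \sum_{n=0}^{\infty}(v+n)^{-s}$
is the generalized zeta-function, Eq. 6.2 takes the elegant form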


d+ioo 
(6.21) f(x) = (1/2) — 1(4— 1) *""]y(s)ds 
d-i00 
$1(z) = 0$ for $z < 0$, $= 1$ for $z > 0$, and $1/2$ for $z = 0$.
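The appearance of the generalized zeta-function rests on the classical Mellin integral $\int_0^\infty v^{s-1}e^{-pv}(1-e^{-v})^{-1}\,dv = \Gamma(s)\,\zeta(s,p)$, valid for $\Re(s) > 1$, $p > 0$. The sketch below checks this numerically for one assumed pair of real values; the helper names and the Euler-Maclaurin tail correction used to sum the series are choices of this illustration, not of the paper.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def mellin_side(s, p, upper=80.0):
    """Integral of v^(s-1) exp(-p v) / (1 - exp(-v)) over [0, upper], for s > 2."""
    def g(v):
        if v == 0.0:
            return 0.0                       # limit of the integrand when s > 2
        return v ** (s - 1) * math.exp(-p * v) / (-math.expm1(-v))
    return simpson(g, 0.0, upper)

def hurwitz_zeta(s, p, terms=20000):
    """Sum of (p + n)^(-s), n >= 0, with an Euler-Maclaurin estimate of the tail."""
    partial = sum((p + n) ** (-s) for n in range(terms))
    start = p + terms
    # tail: integral of the summand plus half of the first omitted term
    return partial + start ** (1 - s) / (s - 1) + 0.5 * start ** (-s)

s, p = 3.0, 1.5                              # illustrative real values
left = mellin_side(s, p)
right = math.gamma(s) * hurwitz_zeta(s, p)
print(left, right)
```

The same expansion $(1-e^{-v})^{-1} = \sum e^{-nv}$ that produces the zeta series is what turns the kernel into a sum of shifted powers, and hence Eq. 6.7 into a sum of translates of $f$.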
THEOREM 4. Eq. 6.21 is valid if $x, x-1, \cdots, x-r$
are interior to neighborhoods of bounded variation of $f(x)$ and $A_1 > 0$, $A_2 = \infty$,
and furthermore $f(x)$ belongs to $L_1(A_1, \infty)$, provided that $\Re(s) = d > 1$.
Consider 


+iB +iB 
(6.3) (1/2zt) (s)ds = (1/221) f(A) f(s, A) dads 


5 d+iB 


(1/2ni) f f(A) (A + 


d-iB 


The inversion of order of integration is justified by the observation that for 


fo ~ 
A,e-Bxv then K(s, x) = SA, - (27 — B,)1(x— B,) formally. 
0 0 


²⁶ In fact $\lim_{|t|\to\infty} \psi(d+it, p)$ may not exist. K. Ananda-Rau, Proceedings of the
London Mathematical Society, vol. 19 (1920), p. 114.


$\Re(s) = d > 1$, $|\zeta(s,\lambda)|$ is continuous in $\lambda$ and $s$ when $A_1 \le \lambda \le A_2$ and
accordingly the integrals are absolutely convergent.
On carrying out the integration there results


(6. 31) f. F(A) sin Blog (2/n + 


The term by term integration is correct for } x*(2/n + A)* is uniformly con- 
0 
vergent (in &(s)) for R(s) —=d>1,B< om. 
We may write 


+A)? 
wx log (x/n + A) 


for Nz (> @) sufficiently large, uniformly in B and A. Thus 


6.4) f 1700 


Since also 


sin (Blog (x/n+A)) | <a 


(z/n+A)* 
2 log (t/n+A) (B log x/n + A) | dA < 


N’ 0 
(6. 41) 
A 0 A 
uniformly in £ for finite NV’, x we may invert the operations in Eq. 6. 31 to get 


(z/n + 2)4 
~ F(A) log (x/n + A) 


or what is essentially the same thing according to Eq. 6. 4 


Nz (z/n + 
f(A) nx log + d) 


Accordingly, 
d+iB 
(6.32) Lg (1/2ni) f ds 
d-iB 


(a/n + A)4 
Lag A; log (4/n + dA) 


sin Bz 
= J — n) —— dz 
> f f( ) 


0 og a/nt+A, 


(6.6) (xe-* — n)| dz 
oga/n+Ay_ 
| +A)4| dA < 


sin (B log x/n + A)da 


sin (B log + dA) dd. 


sin (B log z/n + A) da 


Because of Eq. 6.6 and Eq. 6.41 the Riemann-Lebesgue lemma may be 
applied to show that the Dirichlet integrals in Eq. 6.32 vanish unless 
t=n-+A,. Accordingly, the limit of the first integral in Eq. 6. 32 is 


(6.7)  $f(x) + f(x-1) + \cdots + f(x-r)$

where $x - r \ge A_1 > x - r - 1$, provided the function is of bounded variation
in the neighborhoods of the arguments in question.

Similarly 


d+iB 
(6.8) (1/271) 1(~— 1) (x 


Subtraction of the expressions in Eq. 6.8 from those in Eq. 6.7 yields the
desired result.

THEOREM 4A. If $f(x) \subset L_1(A_1, \infty)$, Eq. 6.21 is valid in the sense that
the right-hand integral is summable (C,1) wherever $f(x+0) + f(x-0)$
exists.


We have merely to investigate 


The steps and reasoning of the argument are precisely the same in detail as in
the proof above. The only change is that the resulting integrals (Eq. 6.32
for instance) are of Fejér instead of Dirichlet type.

One immediate application is furnished by the Laplace integral equation,
Eq. 1.1, where $p$ is now a real variable, $F(p)$ is supposed known for $p \ge p_0$
and $f(x)$ is required. There is no fundamental restriction in assuming $f(x)$
bounded and of class $L_1(0,\infty)$.²⁷

For $f(x)$ satisfying the conditions of Theorems 4 or 4A we may exhibit
solutions in the form²⁸


(1) f(a) = (Pani) 


f (ph (1—e*) )dpas 
or 


= | X(s)|, 
= Lin (1/2ni) f 


—1(4—1)(¢— (p)/T(s) (1 — e*) )dpds, 
f(z) =0 for < 


These solutions are easily established on noting that for (v) = (1—e*)” 


Wap. 
²⁸ Evidently the general formal solution may be written

$f(x) = \dfrac{1}{2\pi i}\int_{\sigma-i\infty}^{\sigma+i\infty} K(s,x)\int_0^\infty v^{s-1}\,\varphi(v)\,F(v)\,dv\,ds.$


Theorem 3 applies to the inversion indicated in Eq. 6.11. In fact it is manifest 
00 

| F(v)| < Me4/v and accordingly | ve (v) (1— dv is certainly 
0 


absolutely convergent for $\Re(s) > 2$.

A solution, Eq. 7.4, formally somewhat similar to that given by Paley
and Wiener²⁹ (who, however, work in the domain of $L_2$ functions) follows on
using the specialization $p = 1$, $\varphi(v) = v$ or $\psi(s,p) = \Gamma(s+1)$ in Eq. 6.
This may be rigorously established for $f(x)$ belonging to $L_1(0,\infty)$ as follows:


(7.1) f(a)/(1 + = [1/P(s + DIY, (A) dA = 
0 0 
by Theorem 3 and Eq. 5.1 for #&(s) >0. Writing «+ 1—e* 


(7.2) v(s) = (e# —1)dz. 

Clearly, 

Hence Eq. 1.2 applies and
(7.3) (1/2ni) (s)ds 0, 


or 
(7.4) f(z) (1/2mi) (@ + 1)(s)ds 
The result holds when $x$ is interior to a neighborhood of bounded variation
for $f(x)$.

Here also a generalization is afforded by replacing convergence by
summability (C,1) or Sommerfeld type, in that the last integral of Eq. 7.4 is
summable to $f(x)$ when $f(x+0) + f(x-0)$ has meaning. This observation
hinges essentially on the fact that the summability property of Fourier
integrals, since $f \subset L_1$, is patently directly extensible to Eq. 1.21 and hence to
Eq. 7.3 and thus to Eq. 7.4.

UNIVERSITY OF ILLINOIS,
URBANA, ILL.,
PURDUE UNIVERSITY,
LAFAYETTE, IND.

²⁹ P. W., pp. 37-39. (Here references to D. V. Widder's work may be found as well.)
The P. W. solution, apart from its implication of $L_2$ function classes and an apparent
integration order change, is essentially transformable to the type of Eq. 7.4 with the
specialization.


NOTE ON FORMAL LOGIC.* 
By M. H. Stone. 


It has been observed that the theory of Boolean algebras assumes a
particularly satisfactory algebraic form when developed in terms of the
symmetric difference a + b and the product a·b as fundamental operations: for
Boolean algebras are then characterized as rings with unit in which every
element is idempotent.¹ The close connection between Boolean algebras and
the formal (Aristotelian) logic of propositions therefore suggests that a logistic
system built up from corresponding operations would be of some interest, and
would have the special advantage of reducing the proofs of most logical
theorems to simple and essentially familiar algebraic calculations. In the
present note we shall develop such a system, based on results of Leśniewski and
Bernstein.²

Propositions are to be regarded as abstract entities and denoted by the
letters a, b, c, ···. We postulate three primitive operations on propositions,
each of which results in a new proposition; and indicate the propositions
resulting from their application by a + b, a·b, and a' respectively. We may
read a + b as “a if and only if b” or “a is equivalent to b”; we may read
a·b as “a or b”; and we may read a' as “not a.” To indicate that a particular
proposition a is to be placed on the list of asserted propositions we write
⊢a. As primitive assertions, we postulate that for arbitrary propositions
a, b, c (whether given directly or expressed as “polynomials” in terms of the
postulated operations and other directly given propositions)

(1.1) ⊢[(a+b) + (c+a)] + [b+c]
(1.2) ⊢[a + (b+c)] + [(a+b) + c]
(2.1) ⊢[a·(b·c)] + [(a·b)·c]
(2.2) ⊢[(a+b)·c] + [(c·a) + (c·b)]
(2.3) ⊢(a·a) + a
(3.1) ⊢[(a+a')·b] + b.


* This note was written while the author was a Fellow of the John Simon
Guggenheim Memorial Foundation in residence at the Institute for Advanced Study
as a temporary member. Received by the Editors January 7, 1937; revised March
22, 1937.

¹ Stone, Transactions of the American Mathematical Society, vol. 40 (1936), pp.
37-111, especially pp. 39-48.

² Leśniewski, Fundamenta Mathematicae, vol. 14 (1929), pp. 1-81; B. A. Bernstein,
Annals of Mathematics (2), vol. 37 (1936), pp. 317-325.




In order to bring other propositions upon the list of asserted propositions, we
postulate the informal deductive rules

(A) if $\vdash a$ and $\vdash a + b$, then $\vdash b$;
(B) if $\vdash a$, then $\vdash a \cdot b$.

The application of these rules will be indicated by the schemes

$$\text{(A)}\ \frac{\vdash a \qquad \vdash a + b}{\vdash b}, \qquad\qquad \text{(B)}\ \frac{\vdash a}{\vdash a \cdot b}.$$

We introduce a relation = between propositions as follows:

DEFINITION 1. $a = b$ if $\vdash a + b$.

We can then introduce two further operations either through the definitions³

DEFINITION 2. $\vdash [a \,\&\, b] + [(a + b) + (a \cdot b)]$,
DEFINITION 3. $\vdash [a \to b] + [b + (a \cdot b)]$,

or through the equivalent definitions

DEFINITION 2'. $a \,\&\, b = (a + b) + (a \cdot b)$,
DEFINITION 3'. $a \to b = b + (a \cdot b)$.


The proposition $a \,\&\, b$ may be read "a and b," the proposition $a \to b$
may be read "a implies b." We shall see that the interpretations of the
primitive and defined operations are all justified by subsequent results.
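As a quick sanity check on these readings, the primitive assertions can be evaluated in the two-element ring. The sketch below is only an illustration under an assumption that is not part of the text at this point: the value 0 is taken to stand for an asserted proposition and 1 for its opposite, so that $\vdash p$ is modelled by "p evaluates to 0". All function names are hypothetical.

```python
from itertools import product

# Assumed two-valued model (an illustration, not the author's construction):
# 0 stands for an asserted proposition, 1 for a rejected one.

def add(a, b):
    """a + b, read "a if and only if b": 0 exactly when a and b agree."""
    return a ^ b

def mul(a, b):
    """a . b, read "a or b": 0 exactly when a or b is 0."""
    return a & b

def neg(a):
    """a', read "not a"."""
    return a ^ 1

def asserted(p):
    """Model of |- p: the proposition evaluates to 0."""
    return p == 0

def primitive_assertions(a, b, c):
    """Values of the formulas (1.1)-(3.1) for one assignment of a, b, c."""
    return {
        "1.1": add(add(add(a, b), add(c, a)), add(b, c)),
        "1.2": add(add(a, add(b, c)), add(add(a, b), c)),
        "2.1": add(mul(a, mul(b, c)), mul(mul(a, b), c)),
        "2.2": add(mul(add(a, b), c), add(mul(c, a), mul(c, b))),
        "2.3": add(mul(a, a), a),
        "3.1": add(mul(add(a, neg(a)), b), b),
    }

def all_axioms_hold():
    """True if every primitive assertion is asserted for every assignment."""
    return all(
        asserted(value)
        for a, b, c in product((0, 1), repeat=3)
        for value in primitive_assertions(a, b, c).values()
    )

def conj(a, b):
    """Definition 2': a & b = (a + b) + (a . b)."""
    return add(add(a, b), mul(a, b))

def implies(a, b):
    """Definition 3': a -> b = b + (a . b)."""
    return add(b, mul(a, b))
```

In this model `conj(a, b)` is 0 only when both arguments are 0, and `implies(a, b)` is 1 only when a is 0 and b is 1, which agrees with the readings "a and b" and "a implies b".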

We commence our investigation by considering the consequences of (1.1),
(1.2), (A) and Definition 1. The system so described is due to Leśniewski.⁴
We obtain the following fundamental result:


THEOREM 1. In terms of the operation + and the relation =, taken as
an equality-relation, the system under consideration is an additive abelian
group in which every element is of order 2. If the zero element of this group
be denoted by 0, the statements $a = 0$ and $\vdash a$ are equivalent. In
particular, we have, for all a, b, c, d,

(α) $a = a$;
(α') if $a = b$, then $b = a$;
(α'') if $a = b$ and $b = c$, then $a = c$;
(β) if $a = c$ and $b = d$, then $a + b = c + d$;
(γ) $a + b = b + a$;
(δ) $a + (b + c) = (a + b) + c$;
(ε) the equation $x + b = a$ has $a + b$ as a solution.


³ The usual form of definition would be to describe $a \,\&\, b$, $a \to b$ as
abbreviations for $(a + b) + (a \cdot b)$, $b + (a \cdot b)$ respectively. For
comments on the present form, which is better suited to our later algebraic
considerations, if not to the requirements of a strictly formal logic, see
Tarski (Tajtelbaum), Fundamenta Mathematicae, vol. 4 (1923), pp. 196-200,
especially p. 197.

⁴ Leśniewski, loc. cit. We write + in place of his =.


It is well known that the properties (α) to (δ), together with the existence
of a solution of the equation $x + b = a$, are characteristic for abelian
groups.⁵ They imply that the solution of $x + b = a$ is unique (in the sense
that $x + b = a$ and $y + b = a$ imply $x = y$) and that the zero element 0
exists and satisfies the equation $x + a = a$ for every a. Thus the property (ε)
above yields the special relation $a + a = 0$ for every a; in other words, every
element is of order 2. Accordingly, we need establish only the properties
(α) to (ε).

We begin with several lemmas, as follows: 


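(1.3) $\vdash (a + b) + (b + a)$;
(A') if $\vdash a + b$, then $\vdash b + a$;
(A'') if $\vdash b$ and $\vdash a + b$, then $\vdash a$;
(1.4) $\vdash a + a$;
(1.5) $\vdash [b + a] + [(c + b) + (a + c)]$;
(1.6) $\vdash a + [b + (b + a)]$.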


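Ad (1.3). Substituting b, a, b for a, b, c respectively in (1.2), and
b, a + b, b + a for a, b, c respectively in (1.1), we obtain the scheme

$$\text{(A)}\ \frac{\vdash [b + (a + b)] + [(b + a) + b] \qquad \vdash \{[b + (a + b)] + [(b + a) + b]\} + \{(a + b) + (b + a)\}}{\vdash (a + b) + (b + a)}.$$

Ad (A'). Using (1.3) we have the scheme

$$\text{(A)}\ \frac{\vdash a + b \qquad \vdash (a + b) + (b + a)}{\vdash b + a}.$$

Ad (A''). We now have the scheme

$$\text{(A')}\ \frac{\vdash a + b}{\vdash b + a}, \qquad \text{(A)}\ \frac{\vdash b \qquad \vdash b + a}{\vdash a}.$$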


⁵ See, for instance, van der Waerden, Moderne Algebra, Berlin, 1930, vol. I, pp. 15-19.




Ad (1.4). Substituting b, a for a, b respectively in (1.3) and b, a, a for
a, b, c respectively in (1.1), we obtain the scheme

$$\text{(A)}\ \frac{\vdash (b + a) + (a + b) \qquad \vdash [(b + a) + (a + b)] + [a + a]}{\vdash a + a}.$$

Ad (1.5). Substituting c, b, a for a, b, c respectively in (1.1), we obtain
the scheme

$$\text{(A')}\ \frac{\vdash [(c + b) + (a + c)] + [b + a]}{\vdash [b + a] + [(c + b) + (a + c)]}.$$

Ad (1.6). Using (1.3) and substituting a, b, b + a for a, b, c respectively
in (1.2), we obtain the scheme

$$\text{(A'')}\ \frac{\vdash (a + b) + (b + a) \qquad \vdash \{a + [b + (b + a)]\} + \{(a + b) + (b + a)\}}{\vdash a + [b + (b + a)]}.$$

This completes the proof of the lemmas listed above.

We turn now to the properties (α) to (ε), taking them up in a somewhat
altered order.

Ad (α). By (1.4) and Def. 1, we have $a = a$.

Ad (α'). By Def. 1, $a = b$ means $\vdash a + b$. Hence $a = b$ implies
$\vdash b + a$ by (A') and thus $b = a$ by Def. 1.

Ad (α''). If $a = b$ and $b = c$, then $\vdash b + a$ and $\vdash c + b$ by
(α') and Def. 1. Hence, by using (1.5), we obtain the scheme

$$\text{(A)}\ \frac{\vdash b + a \qquad \vdash [b + a] + [(c + b) + (a + c)]}{\vdash (c + b) + (a + c)}, \qquad \text{(A)}\ \frac{\vdash c + b \qquad \vdash (c + b) + (a + c)}{\vdash a + c}.$$

Thus $a = c$ by Def. 1.

Ad (γ). By (1.3) and Def. 1, we have $a + b = b + a$.

Ad (δ). By (1.2) and Def. 1, we have $a + (b + c) = (a + b) + c$.

Ad (β). If $a = c$, we have $\vdash a + c$ by Def. 1. On substituting c, a, b
for a, b, c respectively in (1.5), we obtain the scheme

$$\text{(A)}\ \frac{\vdash a + c \qquad \vdash [a + c] + [(b + a) + (c + b)]}{\vdash (b + a) + (c + b)}.$$

By Def. 1, we have $b + a = c + b$; and by (γ) and (α'') we infer that
$a + b = c + b$. If $b = d$, we can substitute b, c, d for a, b, c respectively
in this equation, obtaining $b + c = d + c$. Applying (γ) and (α''), we infer
that $c + b = c + d$. Thus a final application of (α'') yields
$a + b = c + d$.

Ad (ε). By (α), (β), and (γ), we have $(a + b) + b = (b + a) + b$. By (α'')
and (γ), we then have $(a + b) + b = b + (b + a)$. On the other hand, (1.6) and
Def. 1 yield $a = b + (b + a)$. Hence (α') and (α'') yield $(a + b) + b = a$.


We still have to establish the equivalence of $a = 0$ and $\vdash a$. From the
preceding results, we know that $a = 0$ if and only if $a = a + a$; and that the
statements $a = a + a$ and $\vdash a + (a + a)$ are equivalent. Now, using (1.4)
and substituting a, a for a, b respectively in (1.6), we obtain the schemes

$$\text{(A'')}\ \frac{\vdash a + a \qquad \vdash a + (a + a)}{\vdash a}, \qquad \text{(A)}\ \frac{\vdash a \qquad \vdash a + [a + (a + a)]}{\vdash a + (a + a)}.$$

Hence the statements $\vdash a$ and $\vdash a + (a + a)$ are equivalent. It
follows that the statements $a = 0$ and $\vdash a$ are equivalent.

We next consider the effect of introducing (2.1), (2.2), (2.3) and (B)
into the system studied in Theorem 1. We obtain the following fundamental
result:


THEOREM 2. In terms of the operations + and · and of the relation =
of Definition 1, the system under consideration is a Boolean ring, that is,
a ring (necessarily commutative) in which every element is idempotent.⁶ In
particular, we have, for all a, b, c, d,

(η) if $a = c$ and $b = d$, then $a \cdot b = c \cdot d$;
(θ) $a \cdot b = b \cdot a$;
(ι) $a \cdot (b \cdot c) = (a \cdot b) \cdot c$;
(κ) $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$;
(λ) $a \cdot a = a$.


It is well known that the properties (α) to (δ), (η) to (κ), together with the
existence of a solution of the equation $x + b = a$, are characteristic for a
commutative ring.⁷ We may remark that (λ) implies $a + a = 0$ and hence
(ε): for obvious applications of (α), (β), (θ), (κ), and (λ) yield

$$a + a = (a + a) \cdot (a + a) = [(a \cdot a) + (a \cdot a)] + [(a \cdot a) + (a \cdot a)] = [a + a] + [a + a].$$


⁶ Stone, loc. cit.
⁷ See, for instance, van der Waerden, Moderne Algebra, Berlin, 1930, vol. I, pp. 36-40.


We now discuss the indicated properties in a somewhat altered order.


Ad (ι). By (2.1) and Def. 1, we have $a \cdot (b \cdot c) = (a \cdot b) \cdot c$.

Ad (λ). By (2.3) and Def. 1, we have $a \cdot a = a$.

Ad (θ). By (λ) we have $(a + b) \cdot (a + b) = a + b$. By (2.2) and Def. 1,
we have $(a + b) \cdot (a + b) = [(a + b) \cdot a] + [(a + b) \cdot b]$,
$(a + b) \cdot a = (a \cdot a) + (a \cdot b)$,
$(a + b) \cdot b = (b \cdot a) + (b \cdot b)$.
Applying (α'), (α''), (β), (γ), (δ), and (λ) in an obvious manner, we therefore
obtain $[(a \cdot b) + (b \cdot a)] + (a + b) = a + b$. Theorem 1 now shows that
$(a \cdot b) + (b \cdot a) = 0$ and hence that $a \cdot b = b \cdot a$.

Ad (κ). Substituting b, c, a for a, b, c respectively in (2.2) and applying
Def. 1, we have $(b + c) \cdot a = (a \cdot b) + (a \cdot c)$. Then by (θ)
and (α''), we have $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$.

Ad (η). If $a = c$, then $\vdash a + c$ by Def. 1. On substituting a, c, b for
a, b, c respectively in (2.2), we therefore have the scheme

$$\text{(B)}\ \frac{\vdash a + c}{\vdash (a + c) \cdot b}, \qquad \text{(A)}\ \frac{\vdash (a + c) \cdot b \qquad \vdash [(a + c) \cdot b] + [(b \cdot a) + (b \cdot c)]}{\vdash (b \cdot a) + (b \cdot c)}.$$

By Def. 1, we then have $b \cdot a = b \cdot c$. If $b = d$, we can substitute
b, c, d for a, b, c respectively in this equation, obtaining
$c \cdot b = c \cdot d$. Applying (α'') and (θ) in an obvious way, we obtain
$a \cdot b = c \cdot d$.


We observe that the informal rule (B) has been used only in the proof
of (η). It is therefore of particular interest to note further that (B) can be
deduced from (1.1), (1.2), (A), (2.2), and (η), as we shall now show. If
$\vdash a$, then $a = 0$ by Theorem 1; and $a = a + a$, also by Theorem 1. By
(α), (η), and Def. 1, we therefore have $\vdash [a \cdot b] + [(a + a) \cdot b]$.
Hence, on substituting $b \cdot a$ for a in (1.4) and a, a, b for a, b, c
respectively in (2.2), we obtain the scheme

$$\text{(A'')}\ \frac{\vdash (b \cdot a) + (b \cdot a) \qquad \vdash [(a + a) \cdot b] + [(b \cdot a) + (b \cdot a)]}{\vdash (a + a) \cdot b}, \qquad \text{(A'')}\ \frac{\vdash (a + a) \cdot b \qquad \vdash [a \cdot b] + [(a + a) \cdot b]}{\vdash a \cdot b}.$$




Thus (η) and (B) may be regarded as equivalent with respect to the primitive
propositions and the informal rule (A) of the system.

We now introduce the last of our primitive propositions, (3.1). This
proposition is due essentially to Bernstein.⁸ We then have


THEOREM 3. The postulation of (3.1) is equivalent to the postulation
of a unit e in the Boolean ring of Theorem 2 together with the definition
$a' = a + e$.


By (3.1) and Definition 1, we have $(a + a') \cdot b = b$ for every element b.
The element $a + a'$ thus has the properties of a unit in the Boolean ring of
Theorem 2. Since two units in a commutative ring are necessarily equal,⁹
we see that the unit $a + a'$ is independent of a. Denoting the unit by e, we
therefore have $a + a' = e$; and we conclude by Theorem 1 that $a' = a + e$ for
every a. On the other hand, if the Boolean ring of Theorem 2 has a unit e and
$a' = a + e$, we can apply (α), (α'), (α''), (β), (δ), (θ), and (η) to obtain

$$(a + a') \cdot b = [a + (a + e)] \cdot b = [(a + a) + e] \cdot b = [0 + e] \cdot b = e \cdot b = b;$$

and Def. 1 then yields (3.1) $\vdash [(a + a') \cdot b] + b$.
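A concrete Boolean ring with unit may help in following Theorems 2 and 3. The sketch below is an illustration only: it assumes the familiar model in which the elements are the subsets of a small finite set, addition is symmetric difference and multiplication is intersection. The universe `E`, which plays the role of the unit e, and all function names are hypothetical choices, not notation from the text.

```python
from itertools import combinations, product

E = frozenset({1, 2, 3})   # assumed universe; it plays the role of the unit e
ZERO = frozenset()         # the empty set is the zero element

def subsets(universe):
    """All subsets of `universe`, each as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def add(a, b):
    """Ring addition: symmetric difference."""
    return a ^ b

def mul(a, b):
    """Ring multiplication: intersection."""
    return a & b

def prime(a):
    """a' = a + e, the definition appearing in Theorem 3."""
    return add(a, E)

def check_boolean_ring_with_unit():
    """Check the ring laws, idempotence, and (a + a') . b = b by brute force."""
    elems = subsets(E)
    for a, b, c in product(elems, repeat=3):
        ok = (
            add(a, b) == add(b, a)
            and add(a, add(b, c)) == add(add(a, b), c)
            and mul(a, b) == mul(b, a)
            and mul(a, mul(b, c)) == mul(mul(a, b), c)
            and mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
            and mul(a, a) == a            # every element is idempotent
            and add(a, a) == ZERO         # every element is of order 2
            and mul(E, b) == b            # E acts as a unit
            and mul(add(a, prime(a)), b) == b
        )
        if not ok:
            return False
    return True
```

In this model `prime(a)` is the set-theoretic complement of a, so $a + a'$ is the whole universe for every a, in line with the remark that the unit $a + a'$ is independent of a.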
The results obtained in Theorems 1, 2, and 3 may now be inverted as
follows:


THEOREM 4. In an additive abelian group, with 0 as its zero element,
let the truth of the equation $a = 0$ be indicated by $\vdash a$. If this group
has the property that every element is of order 2, then (1.1), (1.2), and (A)
are theorems; if this group is a Boolean ring under a suitable multiplication,
then (1.1), (1.2), (2.1), (2.2), (2.3), (A), and (B) are theorems; and, if this
group is a Boolean ring with unit under a suitable multiplication, then (1.1),
(1.2), (2.1), (2.2), (2.3), (3.1), (A), and (B) are theorems. In each of
these cases, $a = b$ if and only if $a + b = 0$ or $\vdash a + b$.


The proof may be left to the reader.

In order to illustrate the demonstration of logical theorems, in accordance
with the principle established in Theorem 1 that the statements $\vdash a$ and
$a = 0$ are equivalent, we give the following result:


THEOREM 5. In the logistic system under consideration we have for
all a, b, c


⁸ B. A. Bernstein, loc. cit.
⁹ See, for instance, van der Waerden, Moderne Algebra, vol. I (Berlin, 1930), p. 40.




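(4.1) $\vdash (a' \to a) \to a$;
(4.2) $\vdash a \to (a' \to b)$;
(4.3) $\vdash [a \to b] \to [(b \to c) \to (a \to c)]$;

together with the informal deductive rule

(C) if $\vdash a$ and $\vdash a \to b$, then $\vdash b$.

Corresponding to the informal rule (C) we have for all a, b

(5.1) $\vdash [a \,\&\, (a \to b)] \to b$.


According to Definitions 2, 3 (2', 3') and Theorems 1, 2, 3 we have to
establish the algebraic identities

(4.1*) $a + [a + (a + e) \cdot a] \cdot a = 0$,
(4.2*) $\{b + [(a + e) \cdot b]\} + (a \cdot \{b + [(a + e) \cdot b]\}) = 0$,
(4.3*) $[\{c + (a \cdot c)\} + \{[c + (b \cdot c)] \cdot [c + (a \cdot c)]\}] + [\{b + (a \cdot b)\} \cdot (\{c + (a \cdot c)\} + \{[c + (b \cdot c)] \cdot [c + (a \cdot c)]\})] = 0$,
(5.1*) $b + [(\{a + [b + (a \cdot b)]\} + \{a \cdot [b + (a \cdot b)]\}) \cdot b] = 0$,

together with the rule

(C*) if $a = 0$ and $b + (a \cdot b) = 0$, then $b = 0$.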


Since the operations are to be carried out in a commutative ring with unit e,
we can expand these expressions in a familiar way, we can drop all brackets
(using the convention that multiplications take precedence over additions),
we can write the factors of any product in alphabetical order, and we can
drop e as a factor from any product in which it occurs. Our alleged identities
then assume the respectively equivalent forms


(4.1**) $a + a \cdot a + a \cdot a \cdot a + a \cdot a = 0$,


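(4.2**) $b + a \cdot b + b + a \cdot b + a \cdot a \cdot b + a \cdot b = 0$,

(4.3**) $c + a \cdot c + c \cdot c + a \cdot c \cdot c + b \cdot c \cdot c + a \cdot b \cdot c \cdot c + b \cdot c + a \cdot b \cdot c + b \cdot c \cdot c + a \cdot b \cdot c \cdot c + b \cdot b \cdot c \cdot c + a \cdot b \cdot b \cdot c \cdot c + a \cdot b \cdot c + a \cdot a \cdot b \cdot c + a \cdot b \cdot c \cdot c + a \cdot a \cdot b \cdot c \cdot c + a \cdot b \cdot b \cdot c \cdot c + a \cdot a \cdot b \cdot b \cdot c \cdot c = 0$,

(5.1**) $b + a \cdot b + b \cdot b + a \cdot b \cdot b + a \cdot b \cdot b + a \cdot a \cdot b \cdot b = 0$.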




By application of the special rules $a \cdot a = a$, $a + a = 0$, these
relations are seen to be identities, as we wished to prove. As to (C*), we note
that $a = 0$ implies $b + (a \cdot b) = b + (0 \cdot b) = b + 0 = b$ and hence
that $a = 0$ and $b + (a \cdot b) = 0$ together imply $b = 0$.
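The reduction just described (expand, then apply $a \cdot a = a$ and $a + a = 0$) is mechanical, and a short program can carry it out. The sketch below is a hypothetical illustration, not part of the paper: a polynomial is stored as a set of monomials and a monomial as a set of variable names, so that repeated factors collapse and equal terms cancel in pairs automatically. The names `var`, `imp`, `conj`, and `prime` are assumptions introduced here.

```python
from itertools import product

# A polynomial is a frozenset of monomials; a monomial is a frozenset of
# variable names. The empty monomial is the unit e; the empty polynomial is 0.
ZERO = frozenset()
E = frozenset({frozenset()})

def var(name):
    """The polynomial consisting of the single variable `name`."""
    return frozenset({frozenset({name})})

def add(p, q):
    """Addition: equal monomials cancel in pairs (a + a = 0)."""
    return p ^ q

def mul(p, q):
    """Multiplication: multiply out, merging repeated factors (a . a = a)."""
    result = set()
    for m, n in product(p, q):
        result ^= {m | n}   # toggling keeps coefficients modulo 2
    return frozenset(result)

def imp(p, q):
    """Definition 3': p -> q = q + (p . q)."""
    return add(q, mul(p, q))

def conj(p, q):
    """Definition 2': p & q = (p + q) + (p . q)."""
    return add(add(p, q), mul(p, q))

def prime(p):
    """Theorem 3: p' = p + e."""
    return add(p, E)

a, b, c = var("a"), var("b"), var("c")

# Left-hand sides of (4.1*), (4.2*), (4.3*), (5.1*), built from the
# definitions rather than typed out term by term.
theorem_5 = {
    "4.1": imp(imp(prime(a), a), a),
    "4.2": imp(a, imp(prime(a), b)),
    "4.3": imp(imp(a, b), imp(imp(b, c), imp(a, c))),
    "5.1": imp(conj(a, imp(a, b)), b),
}
```

Each entry of `theorem_5` reduces to `ZERO`, which mirrors the statement that the four relations are identities. For the rule (C*), `imp(ZERO, b)` returns `b`, matching the computation $b + (0 \cdot b) = b$.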

It has been shown by Łukasiewicz and Tarski¹⁰ that a complete logistic
system for the Aristotelian logic of propositions can be based on the primitive
operations $\to$ and $'$ with (4.1), (4.2), (4.3) as primitive propositions and
(C) as the sole informal deductive rule. Hence the system discussed here
contains all of the ordinary logic of propositions. Our system has a similar
relation to that of Russell and Whitehead, whose primitive operations
correspond to our $\cdot$ and $'$. On the other hand, it can be shown that the
Łukasiewicz-Tarski and Russell-Whitehead systems contain ours (under suitable
definitions of $+$ and $\cdot$ when not taken as primitive). Since this aspect
of the situation is quite familiar, we do not go into detail.


HARVARD UNIVERSITY. 


¹⁰ Łukasiewicz and Tarski, Comptes rendus de la Société des Sciences et des Lettres
de Varsovie, Classe III (1930), pp. 30-50; Tarski, Fundamenta Mathematicae, vol. 25
(1935), pp. 503-526, especially p. 506.




A CASE OF COLORATION IN THE FOUR COLOR PROBLEM.* 


By C. E. Winn. 


In studying the four color problem we assume¹ a map divided by a connected
trihedral network into a finite number of polygons. Errera² has shown
that, if no polygon has more than 6 sides, such a map is reducible, i.e. its
coloration can be made to depend on that of one or more maps of fewer polygons.
In his treatment, however, the reduced figures are not generally maps of the
original type, as they may contain polygons of more than 6 sides.
Consequently, the resulting map cannot be further reduced by the same method.

In the present paper we shall obtain reductions or sets of reductions which 
preserve the type of map in the reduced figure, with a view to proving that 


I. A map S containing at most one polygon of more than 6 sides can be 


colored. 


We shall start by obtaining new reductions which, in conjunction with 


those already known, may be embodied in the result 


II. Any polygon of less than 7 sides in an irreducible map must touch a
polygon of more than 6 sides.
We now quote the known reductions required here, giving an explanation 


of how the reduced figures are formed. 
A. A polygon of less than 5 sides.³


The reduction is made by removing any side; only, in the case of a
quadrilateral in a 2-ring,⁴ we must remove a side bounding the ring, in order
to avoid the creation of an isthmus.⁵

It will be seen at once that the reduction of a digon and a triangle
diminishes the vertices of the adjacent polygons by two and one respectively,
while that of a quadrilateral deprives of one vertex the two polygons abutting
the suppressed side.


* Received March 31, 1937.

¹ For a general account of the subject see Sainte-Lagué, “Géométrie de situation et
jeux,” Mémorial des Sciences Mathématiques, fasc. 41, and the thesis of M. A. Errera,
“Du coloriage des cartes, etc.,” Bothy, Ixelles, 1921.

² “Une contribution au problème des quatre couleurs,” Bulletin de la Société
Mathématique de France, vol. 53 (1925), p. 42.

³ A. B. Kempe, “On the geographical problem of the four colors,” American Journal
of Mathematics, vol. 2 (1879), p. 198.


B. A ring of 5 polygons or fewer enclosing more than one polygon. 


The reduced maps for a 2- or 3-ring are formed by suppressing the part 
of the map on one side of the ring. 

Two of the reductions for a 4- or 5-ring are made in the same way; and 
the others are obtained from them by a further removal of two non-adjacent 
sides of the quadrilateral or pentagon newly formed (i. e. including more than 
one polygon of the given map). 

The reductions A and B are fundamental in as far as they ensure the 
presence of two successive rings about each polygon of an irreducible map, 
without which all other known reductions and those given here might fail, 


should an isthmus occur in the reduced figure. 


C. A polygon completely surrounded by pentagons.⁶


In the reduction the polygon coalesces with alternate regions on the further
side of the ring, except the last two when the number is odd.


D. A polygon bounded by hexagons and pairs of pentagons, if not by an
odd number of hexagons only.⁷


In the reduction the whole ring is suppressed except the sides joining the 
free vertices (i.e. belonging to one polygon only of the ring) of each 
hexagon and pair of pentagons. 

It may be remarked that the reductions of a pentagon or hexagon flanked
by 3 pentagons along adjacent sides are not employed here. In fact the latter
might involve more than one polygon of 7 sides or more in the reduced figure.


⁴ A ring may be defined as a cyclic sequence of polygons each of which touches that
before and after it, but no other one, in the sequence. The polygons of a 2-ring have
2 separate contacts.

⁵ An isthmus occurs at a boundary which can be crossed once only by a closed
circuit not meeting the network again.

⁶ G. D. Birkhoff, “The reducibility of maps,” American Journal of Mathematics,
vol. 35 (1913), p. 116.

⁷ Birkhoff, loc. cit.⁶ and P. Franklin, “The four color problem,” American
Journal of Mathematics, vol. 44 (1922), p. 225.

In order to establish II, we must show how to reduce any ring composed 
of pentagons or hexagons about a pentagon or hexagon. Actually we need only 
consider those rings in which the pentagons are isolated. For, following a 
remark of Franklin,⁸ we may derive the reduction of a ring containing a pair
or pairs of pentagons from that of a ring obtained on replacing them by pairs 
of hexagons, assuming every pair reduced as in D. The derived cases are 
given in brackets at the head of each reduction. 

With a view to saving space our reductions are mainly set forth in tabular 
form referring to the appropriate figure. Under numbered variants are given 
all essentially distinct groups of the colors 1, 2, 3, 4 bounding the reduced 
configuration, i.e. which are neither permutable nor symmetrically equivalent. 
Obvious abbreviations are employed, such as b = 1 to mean that the polygon b 
bears the color 1. 

If the solution for any variant is immediate, we mark the color of the 
central polygon, from which the ring can be filled in without difficulty. Other- 
wise one or more similar color chains are assumed (1st column), admitting 
an alternation of the intermediate complementary chain or else a change due 
to the absence of former chains (2nd column). There are now two
possibilities:


(a) The new distribution may lead to a direct coloring (3rd column), 
in which case we may suppose the assumed chain absent, and replace the 
previous change by one affecting a polygon named in the first chain, at the 
same time allowing for consequent modifications elsewhere (4th column). If 
a direct solution is not yet available, we may assume another chain in the new 
distribution, and so on until a coloring is finally obtained. 


(b) The change first made may not yield a direct coloring. We then 
examine one or more chains occurring in the derived scheme, continuing as 
in (a). The digression is distinguished from the main case by means of 
brackets, which are closed as soon as a solution is forthcoming. In fact a com- 
pleted bracket is tantamount to a direct coloring of the previous scheme. In 
complicated cases it may be necessary to insert more than one bracket in 
succession. 
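The color chains used in the tables are two-colored connected sets of polygons, and "inverting" a chain means interchanging its two colors. A minimal sketch of that single step is given below, assuming the map is supplied as an adjacency dictionary between polygon labels. The function names and the small example map are hypothetical and are not taken from any of the figures.

```python
from collections import deque

def color_chain(adjacency, coloring, start, pair):
    """Polygons joined to `start` through polygons bearing the two colors in `pair`."""
    if coloring[start] not in pair:
        return set()
    chain, queue = {start}, deque([start])
    while queue:
        p = queue.popleft()
        for q in adjacency[p]:
            if q not in chain and coloring[q] in pair:
                chain.add(q)
                queue.append(q)
    return chain

def invert_chain(adjacency, coloring, start, pair):
    """Return a new coloring with the two colors interchanged along the chain."""
    first, second = pair
    chain = color_chain(adjacency, coloring, start, pair)
    new_coloring = dict(coloring)
    for p in chain:
        new_coloring[p] = second if coloring[p] == first else first
    return new_coloring

def is_proper(adjacency, coloring):
    """True if no two touching polygons bear the same color."""
    return all(coloring[p] != coloring[q]
               for p in adjacency for q in adjacency[p])

# Hypothetical strip of four polygons b, c, d, e, each touching the next.
example_map = {"b": ["c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
example_coloring = {"b": 2, "c": 4, "d": 2, "e": 3}
inverted = invert_chain(example_map, example_coloring, "b", (2, 4))
```

Here the 24 chain from b reaches c and d but not e, and the inverted coloring is still proper, since interchanging two colors on a whole chain cannot create a conflict with polygons outside it.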


⁸ Loc. cit.⁷, p. 232.


E. nd5665 (n> 5). See fig. 1. 


Fig. 1. 


Note that, if a were a pentagon, an isthmus would occur in the reduction. 


(1) a=1,d-~f ord=f—2 == 2 
(2) a=1ld=f=3 


23 b tof c=e=4 “= b= 8; dh —2or3| 
Uu=3 | 
(3) a=3,d~f ord=f=—4 


(4) am 3, danf 


24btohord a=lorc=3 U=2 b=4,f—2or4 


(5) a=3,d=f=3 


24b toh a=] (1) b == 4 
u=1 | 


F. 56666 (or 56655 or 56556). See fig. 2. 


Fig. 2. 


(1) d=2 u=1 
(2) d=4, fhA~ 43 
(3) d=—4,f—4,h—3 


| 42d toa bo =m 13 
u= 
(4) u=1 
| 
| 32dtob c=—=4 d = 2 


(6) d—=3, f—h—2 


42 f toa g=i=3 

32 dtob == 4 uU=é d = 2 

42 ftodora e=3,i—lor f=4;d,h=—2 or 
= 2, h =2 or 4 
i= 


The next reduction is due to Mr. Choinacki. 


G. 66666. See fig. 3. 




(1) cegi not all ® 2 u=1 


⁹ Cp. the reduction of 666666 about a hexagon by Birkhoff, loc. cit.⁶


Fig. 3.


(2) 


24a toe b=d=—3 
[237 toa j=4 
{24etocorgd=lorf=3 u=2 
24etoa | 
C= g=4 | 
(23 a tot j=1 a=3, e=2 or 3 
u=1) e=4 
43; toborda=1,c=—2orl u—1 j= 3 
24itogoreh=3,f—lor3 u=3 
24itoc 1 i=4,a—2 or 4 
u= 3} g == 3 10 
24 etoa d=b=—1 u=1 
24 etoc d= 1 
(31 btodorjc ora=4 u=1 b=1 
u=1) e=4 
3] a=4; ¢,g,1—2or4 
H. 566666 (or 566655 or 565555). See fig. 4. 


Fig. 4. 


(1) If d = 2, we get a direct coloring with u = 1.

(2) If d = 4, we may suppose the 42 chain from d to a present. Hence
we have 423 or 342 for lmn, directly or after inverting 31 of bc. One of these
yields a solution with u = 1, whatever be the colors of f, h, j.

10 By symmetry the chain 23 from e to g is also absent. 




(3) If d = 3, the coloring is direct, provided f = 4. And, when f = 2,
the presence of a 24 chain from a to f allows the inversion bcde = 1313.
Again one of the alternatives affords a solution for any colors of h and j.


(4) If d =f we get a direct coloring for h = 2 or 3. When h =4, 
we proceed as in (2). 


The above method can be extended to reduce any polygon enclosed by an 
odd number of hexagons and a pentagon. 

The next two reductions are much simplified if we reduce symmetrically 
by joining the free vertices of the two pentagons across the ring. Unfor- 
tunately, however, the two new polygons resulting may both have more than 
6 sides, even after A is applied. 


J. 565666 (or 565556). See fig. 5. 


(1) d=2, e=3, 34 
(2) d=2, e=—3, 9g =3,i—4 


Wytohorf 
12) toc b= 4 u=—?2 ja = 21 


43 etob cd = 21 u=lor2 e==3 
(1) or (2) 


(4) d=3, e=2, 34 


Fig. 5. 
(3) d=2,¢—4 




43 b 
(24 etot 


ja=21 


fgh = 313 


u=1) 


e=4 
1=3;d,g =83 or4 


(6) 
(2) 4, 2, 


23etogorb f=4orcd=—41 u= e=3,1=—=2 or 3 
| u=1 
(8) d=4,e—2,g—i—4 
431 to b ja = 21 
(42 dtoj7 abc = 313 “= de = 24, g =4 or 2 
u=1) 1=3;d,g=—4 or3 
(9) d=4, e=3, gi 32 
(10) 4, em 3, gan8, im? 
32 etob cd == 41 
(42 ctoa b=1 | or 4 
u=1) €=23 or 3] 
(6) or (7) | 
(11) d=3, e—4, 941i org=i=3 
42 etoa bcd = 131 e=2 
(4) or (5) 
(12) d=3, 
42etoa bed = 131 
(23 atoc b=4 == 2 a==3; 9,1—2 or 3 


€=23 9,1=2 or 4| 


522 
(5) 
== 
— 
| 
{ 

i 
u=1 | 

i 

{ 


Or 
©o 




(13) d=3,e—4,g=i=4 


| 43 i tod ja == 21 

(14aorctoiti j=3,b=20r3 a=c=—4 

43itoa,gorej=1,h=2 or 

h =m f == u=4 t= 3 

32 dtoborj c=la=4orl d == 2 

347i toe h == u=4 i=4;a,g—4o0r3 
u=4) {== 3 
(9) or (11) 


K. 566566. See fig. 6. 


(1) cme=—2 


(2) cm 2, 3, 42 1 


(3) cm 2, g = 4,1=—2 


32 etoc d=4 2 


(4) 



Fig. 6. 




(5) c=3, e=—4, g =i (g =i = 4 equivalent) 


43 etoc d=2 
[42 etoa be = 31 u=1 
42 f=—h=3 
{23 a tot j=4 
(43; toc,h,f ab = 12, 1—1, 
g=i=!1 u=3 j=3 
u=3) a=3,c=3 or 2 
u= 3} e=2,9 =2 or 4 
U= 2] e=3 
equivalent of (2) or (3) 
(6) c=3,e—2,9 At org=—i—2 
23 (ora) tocd (or b) =4 C= 2 
(1) 
(7) 
24 etoa bed = 313 
tod c=2 
{31 d tof em 4 | 
(34dtobori c=1l;a—2o0rl u= | 
34 dtog de=43,g—4 u—2 de = 43 | 
32etoc d=1 “= 
u == 1) d==1, b 1 or 3 
u= 1} | 
34btog d= 4 u= b= 4, 1=3 or 4 | 
u=1] == 4 
(8) c—3,e—2, g—1—4 
42 etoa bed = 313 
{43 b tot aj = 12 1 
(d to g) | 
43 b tod 
(41 toj e=3 bed = 141 
equivalent to (7) ) b=4,g=—30r4 | 
u=—=1} e=4;9,1—40r2 | 
(4) or (5) | 


| 
| 

| 


A CASE OF COLORATION IN THE FOUR COLOR PROBLEM. 52a 


L. 565656. See fig. 7. 


(1) a=1, deg ~ 234 u=1 
(2) a= 
24dtobort c—3,a=—lor3 u=—2 d=4,g—4or2 
(4) a=3, de = 32, g #4 u==] 
(5) a=3, de = 23, g = 4 
23 d to b c=4 l= d=3 
(6) a=3, de = 24 
23 d to b c=4 u=Ror4 d=3 
i= 
(7) a==3, de— 42, 9 43 
(8) a= 
34 g tod ef == 12 g =4,a=—3 or 4 
2 
(9) a= 3, de— 43, 97 ~2 u=1 
(10) de— 43, g =2 


42dtobort1 c=—3,a=—lor3 u=2 d=2,g9 =2or4 
(4) or (5) 


Fig. 7.


We proceed to color a map S which may contain one polygon x of more
than 6 sides; otherwise we let x be any polygon of S.

First we apply the reduction A to any polygon of less than 5 sides. The 
removal of any side plainly leaves us with a map of type S; and an exhaustion 
of the process leads either to a map of 3 digons, which is colorable, or to one 
free of A polygons. 

Next, B is used in conjunction, if necessary, with A. The suppression 
of the configuration flanking a B ring again leaves a map of type S, as the 
only new polygon has at most 5 sides. It remains for us to examine the other 
reductions already explained for a 4 or 5-ring, concerning which we shall 
justify the following assumptions: 


(1) Any 3 polygons abc of a 4-ring R have at least 2 free vertices on either
side of R, and therefore at most 4 when R is composed of pentagons and
hexagons only.

For, if abc have one vertex only on one side of R, the fourth polygon d has
one or more vertices on this side. In the former case R encloses 2 polygons with
a total of 8 vertices (unless they form a 2-ring). In the latter case we get a
3-ring with d. Further, if abc have no vertex on one side of R, they touch a
quadrilateral or a polygon making a 2-ring with d. Consequently, one of the
previous reductions is applicable.


(2) Any 4 polygons of a B 5-ring R have together at least 3 vertices on
either side of R, and so 5 at most when R is composed of pentagons and
hexagons only.

For, if not, we should get in the same way a ring of 4 or less, or else 
2 polygons with a total of 9 vertices or 3 with a total of 13 or 14, according 
as all or only two of the latter touch the fifth polygon of R. 


(3) If x occurs in a B 5-ring at a, the other polygons bcde possess at most 
4 free vertices on either side of R, unless both b and e have a free vertex on 
each side of R. 

Otherwise, we can replace b or e by the polygon touching abc or dea. It 
is easily verified from a figure that the new ring has at most 4 vertices on 
either side, and on account of (2) surrounds more than one polygon. 


Now consider the reduction of a 4-ring composed of abcd which is made
by uniting a and c into a new polygon y. If a or c = x, there is clearly no
gain of vertices except for x. Otherwise, if x still occurs in R, let b = x.
We see that y = 5 or 6 unless a, c have 3 or 4 vertices outside R (i.e. on
the unreduced side), in which case y = 7 or 8. In view of (1), however,
d then possesses one or no free vertex outside R, and so becomes a triangle or
digon on the further reduction of which we get y = 6, as required.

Let us pass to the other reductions of a B ring composed of abcde, first
supposing a = x. As the fusion of a with c or d will yield no gain of vertices
elsewhere, we need only deal with the two figures formed by uniting bd (or ce)
and be.

In the former case the new polygon y will have r + 5 vertices, where b
and d have together r free vertices outside R. Thus y > 6 only when
r = 2, 3 or 4.

When r = 2, we can further reduce c (< 5) so as to diminish¹¹ the
vertices of y to less than 7.

When r = 3, since c, e have, according to (2), at most 2 free vertices
outside R, either c < 5, e < 5 or c = 2. In each case their further reduction
brings down the number of vertices of y by 2 to less than 7.

When r = 4, it follows from (3) that c, e have no free vertices outside R,
so that their further reduction diminishes the vertices of y by 3 to less than 7.

The proof for the reduction of a 5-ring in which b and e are united runs
on similar lines, use being made of the further reduction of c and d, when
necessary.

Finally, if x does not occur in R, we can no longer appeal to (3). But
the above proof holds for r = 2 or 3 without change, while the further
reduction of a yields the required result when r = 4. Moreover, as no polygon
of R has more than 6 sides, a single reduction suffices, the other cases being
derived by symmetrical or cyclic interchange.

When the reductions A and B are no longer possible, we obtain a map 
of 3 digons or one of type S for which any of the other reductions quoted or 
proved are available without the danger of an isthmus appearing in the reduced 
figure. In each case we shall see that the resulting map is of type S. We 
apply these reductions in the following order: 


(1) D, E, F, G, H, J or K to a polygon u connected with x by the common
side of two hexagons. In each reduction the unreduced sides of the two
hexagons are separate, so that, except in E, u and x form a single polygon.
There is consequently no gain of vertices elsewhere.


(2) D, E, F, H, J, K or L to a polygon connected with x by the common
side of a hexagon and a pentagon. By employing one of the reduced figures or
its symmetrical or cyclic equivalent we again ensure that u and x be united in
the reduction, except possibly in the case E. Thus in F, if b = x, we join
the free vertex of the pentagon to those of the other adjacent hexagon.
Similarly, a cyclic adjustment of fig. 7 for L may be necessary to make x
coalesce with u, the other new polygon having at most 6 vertices.

As regards E, if c, e or g = x, then a, being a hexagon, is reduced to a
digon. Consequently, the new polygon comprising u, which has not more than
8 vertices, retains at most 6 when the digon is reduced. Again, if b (or h) = x
and c = 5, the new polygon including c has at most 8 vertices, which we can
diminish to 6 or less by further reducing d or f. On the other hand, if b = x
and c = 6, we apply instead D or H to the hexagon j, which touches c and i in
common with x.


¹¹ If c has 4 sides left, the previous reduction of 2-rings obviates the creation of
an isthmus on the further reduction of c. The same is true of other quadrilaterals
to be reduced later.


(3) C to x, when surrounded by pentagons only. The reduction is seen
to produce no gain of vertices except for x.

Since one or other of the above configurations is always present, we have 
shown how to reduce S to a map of the same type, and hence to obtain the 


required coloration. 


EGYPTIAN UNIVERSITY, CAIRO. 




ON THE FUNDAMENTAL GROUP OF A CERTAIN CLASS OF 
PLANE ALGEBRAIC CURVES.* 


By W. S. Turpin. 


1. Introduction. The problem of existence of algebraic functions, z,
of two independent variables, x and y, possessing a preassigned branch curve
of order n

(1) $f_n(x, y) = 0$

has been considered by Enriques¹ and Zariski.² Zariski has shown that, in
view of a result of Enriques, this question may be reduced to the consideration
of the Poincaré (fundamental) group of the residual space of the branch curve
(1) relative to its carrying complex projective plane and the application of the
Riemann existence theorem for algebraic functions of one variable having
preassigned branch points.

It is sufficient for the theory of algebraic surfaces from the point of view
of birational transformations to consider branch curves (1) possessing only
ordinary double points and cusps. Zariski has shown that if a curve possesses
only ordinary double points then its fundamental group is necessarily cyclic.
A simple case of a curve whose fundamental group is not cyclic is that of the
branch curve of a cubic surface. If the cubic surface is general, its branch
curve is a sextic, $f_6$, with six cusps on a conic:

(2) $f_6 \equiv [\varphi_3(x, y)]^2 + [\psi_2(x, y)]^3 = 0$,

where $\varphi_3$ and $\psi_2$ are polynomials in x and y of respective degrees
3 and 2. This curve was treated in detail by Zariski and its fundamental group
specifically determined.

An obvious generalization of the curve (2) is the curve $f_{6m}$, of order 6m, 
with $6m^2$ cusps at the intersections of two curves, $\phi_{3m}(x, y) = 0$ and 
$\psi_{2m}(x, y) = 0$, of orders 3m and 2m respectively, where m is a positive integer. 
Such a curve is given by the equation: 

(3)    $f_{6m} \equiv [\phi_{3m}(x, y)]^2 + [\psi_{2m}(x, y)]^3 = 0$. 


* Received December 1, 1936. 
¹ F. Enriques, “Sulla costruzione delle funzioni algebriche di due variabili possedenti 
una data curva di diramazione,” Annali di matematica pura ed applicata, ser. 4, 
vol. 1 (Nov., 1923), pp. 185-198. 
² O. Zariski, “On the problem of the existence of algebraic functions,” American 
Journal of Mathematics, vol. 51 (1929), pp. 305-328. 


The methods of investigation used for the fundamental group of (2) are 
peculiar to the sextic of this type and do not admit of an extension to the more 
general class of curves (3). Hence, it was deemed of interest to investigate 
the structure of the fundamental group of curves of the type (3). One may 
expect that the methods developed in this investigation may point the way 
to a possible procedure for other types of curves, for instance for the branch 
curves of general surfaces of any order in $S_3$. As is known, these curves 
have been completely characterized by B. Segre.³ 

This investigation of the structure of the fundamental group falls under 
three classifications : 


1°. The determination of the fundamental group, $\bar{G}$, of a degenerate 
limit curve, $\bar{f}$, of curves (3). 

2°. The factorization of the relations of $\bar{G}$ into relations belonging 
formally to the fundamental group G' of a virtual curve f' with $6m^2$ cusps, 
of which $\bar{f}$ is a limit curve. 

3°. Verification of the fact that G' = G, where G denotes the fundamental 
group of a curve f of type (3). 


Our method of attacking the problem in our special case contains the nucleus 
of a perfectly general procedure, applicable to an arbitrary plane curve with 
nodes and cusps, provided the complete continuous (irreducible) system {f} 
of curves having the same singularities as f contains some special curve $\bar{f}$, for 
instance a degenerate curve without multiple components, whose fundamental 
group can be directly determined. However, to date, a method has not been 
found for the step 3° of the above procedure that does not appeal to the special 
geometry of the curves f. This verification is necessary because 
the factorization obtained in 2° is not unique. It seems probable that an 
equivalence of possible factorizations can be established by purely group 
theoretic considerations, but efforts in this direction have not been successful up 
to the present. 


2. General properties of the fundamental group G and its associated 
group T.⁴ Consider the curve f determined by the equation f(x, y) = 0 where 
f(x, y) is a polynomial, of degree n, in the complex variables x and y. 


³ B. Segre, “Sulla caratterizzazione delle curve di diramazione nei piani multipli 
generali,” Mem. Accad. Ital., Mat., vol. 1 (1930). 

⁴ The concepts and results in this section are compiled from the following sources: 
S. Lefschetz, “Topology,” American Mathematical Society Colloquium Publications, 
vol. 12 (1930); E. R. van Kampen, “On the fundamental group of an algebraic curve,” 
American Journal of Mathematics, vol. 55 (1933); O. Veblen, “Analysis situs,” American 
Mathematical Society Colloquium Publications, vol. 5 (1922); O. Zariski, “On the 




We shall be interested in the Poincaré group, G, of the residual space, $R$, 
of the curve f relative to its carrying complex projective plane (x, y). 

Let us suppose that the coordinate axes have been chosen in such a manner 
that f does not pass through the point at infinity on the y-axis and that it 
possesses no multiple components. A generic line of the pencil {x = const.} 
will thus have n distinct intersections with the curve f. Moreover, lines of 
this pencil having less than n distinct intersections with f are finite in number. 
We shall call such lines singular lines of the pencil and denote them by $x = \alpha_i$ 
($i = 1, 2, \dots, \nu$). Denote by $\alpha_0$ a point distinct from the set $[\alpha_i]_{i=1,2,\dots,\nu}$ 
and let $[s_i]_{i=1,2,\dots,\nu}$ be a set of non-intersecting loops in the plane of the 
variable x emanating from $x = \alpha_0$ and surrounding respectively the points of 
the set $[\alpha_i]_{i=1,2,\dots,\nu}$. Let $b_1, b_2, \dots, b_n$ be the roots of 
$f(\alpha_0, y) = 0$ and choose loops $g_k$ in the plane of the complex variable y, 
emanating from the point at infinity and surrounding $b_k$, in such a manner 
that $g_i$ and $g_j$ have only the point at infinity in common for $i \neq j$. We denote 
this base point of the loops $g_k$ by O. 

If we define the Poincaré group G of f in the usual manner, it is well 
known that the loops $g_k$ may be taken as generators of G and that G is 
independent of the choice of O to within an isomorphism. It is evident that 
the generators $g_k$ satisfy the following relation: 

(4)    $g_1 g_2 \cdots g_n = 1$. 


As x traverses a closed path in its plane, starting from and returning to $\alpha_0$, 
the set of points $[b_k]$ will move continuously, describing certain paths in the 
y-plane and returning finally to its original position, although the individual 
points may have been permuted among themselves. As the points $b_k$ move, 
their corresponding loops $g_k$ will also vary and this variation is completely 
determined by the motion of the points $b_k$ and the condition that the set $[g_k]$ 
should always consist of non-intersecting loops. 

If, under cyclical variation of x from $\alpha_0$, a root y traverses a path from 
$b_i$ to $b_j$, the loop $g_i$ is transformed into a loop $g'_j$ surrounding $b_j$ alone and 
which must therefore be a transform of $g_j$ by some element of G. Moreover, 
$g_i$ and $g'_j$ are equivalent elements of G and, thus, corresponding to every 
cyclical variation of x from $\alpha_0$, we obtain a relation between the elements of G. 
In particular, if x traverses the loops $[s_i]_{i=1,2,\dots,\nu}$ we obtain motions of the 
roots y, which yield the relations 


problem of existence of algebraic functions,” loc. cit.; O. Zariski, “On the Poincaré 
group of rational plane curves,” American Journal of Mathematics, vol. 58 (1936). 




($k = 1, 2, \dots, n$; $i = 1, 2, \dots, \nu$) 

where $\{p_{1,i}, \dots, p_{n,i}\}$ is a permutation of $\{1, 2, \dots, n\}$. It has been 
shown that the relations (5) together with the relation (4) constitute a 
complete set of generating relations for the group G. 

The class of motions of the set of points $[b_k]$ induced by cyclic variation 
of x incident to the point $\alpha_0$ which carry this set into its original position, 
in such a way that the roots $y_k$ remain distinct during the motion, constitutes 
a group of motions T. Two such motions, $m_1$ and $m_2$, correspond to the same 
element of T if the motion $m_1$ may be deformed into the motion $m_2$ by suitable 
deformation of the path of x so that, during the latter deformation, the induced 
motions keep the roots $y_k$ distinct. 

[Fig. 1. The y-plane.] 

Let $\sigma_j$ ($j = 1, 2, \dots, n-1$) denote an oriented arc from $b_j$ to $b_{j+1}$ of 
such a type that the adjacent arcs have only an end point in common and 
non-adjacent arcs do not intersect (see Fig. 1). Let $T_j$ ($j = 1, 2, \dots, n-1$) 
denote a motion in which the points $b_i$ ($i \neq j, j+1$) are fixed while the 
points $b_j$, $b_{j+1}$ are interchanged, $y_j$ moving from $b_j$ to $b_{j+1}$ along the 
right-hand edge of $\sigma_j$ and $y_{j+1}$ moving from $b_{j+1}$ to $b_j$ along the opposite edge. Any 
motion, m, belonging to T can be deformed into a motion, m', which may be 
expressed as a product of the elementary motions $T_j$. The deformation of 
m into m' affects not only the paths but also the velocities of the points $b_k$. 


The elementary motions $T_j$ satisfy the relations: 

(6)    $T_i T_j = T_j T_i$,  $|i - j| \neq 1$; 
(7)    $T_i T_{i+1} T_i = T_{i+1} T_i T_{i+1}$. 

[Fig. 2. The y-plane.] 

The commutative relation, (6), holds due to the fact that, under the 
assumption $|i - j| \neq 1$, $\sigma_i$ and $\sigma_j$ have nothing in common and the motions $T_i$ and 
$T_j$ are thus independent of the order of their performance. The proof of (7) 
is as follows: Introduce the motion $T^*$ sending $y_{i+2}$ from $b_{i+2}$ to $b_i$ along the 
right-hand edge of the oriented arc $\sigma^*$ and $y_i$ from $b_i$ to $b_{i+2}$ along its opposite 
edge, where $\sigma^*$ is chosen as indicated in Fig. 2. Then the following equalities 
hold: $T_{i+1} T_i = T^* T_{i+1} = T_i T^*$. Solving these for $T^*$ and equating the 
solutions, we have that $T_{i+1} T_i T_{i+1}^{-1} = T_i^{-1} T_{i+1} T_i$. Multiplying this relation on the 
right by $T_{i+1}$ and on the left by $T_i$ we obtain the desired result. 
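The relations (6) and (7) can be checked mechanically once an explicit action of the elementary motions on the loops is fixed. The sketch below assumes one common convention, which is not taken from the text: $T_j$ sends $g_j$ to $g_{j+1}$ and $g_{j+1}$ to $g_{j+1}^{-1} g_j g_{j+1}$, leaving the other loops fixed. Words in the free group on $g_1, \dots, g_n$ are stored as tuples of signed integers; the helper names are hypothetical.

```python
def reduce_word(word):
    """Freely reduce a word; letter k stands for g_k, -k for its inverse."""
    out = []
    for a in word:
        if out and out[-1] == -a:
            out.pop()
        else:
            out.append(a)
    return tuple(out)


def inverse(word):
    """Inverse of a word in the free group."""
    return tuple(-a for a in reversed(word))


def apply_elementary(j, word):
    """Assumed action of T_j: g_j -> g_{j+1}, g_{j+1} -> g_{j+1}^-1 g_j g_{j+1}."""
    images = {j: (j + 1,), j + 1: (-(j + 1), j, j + 1)}
    out = []
    for a in word:
        img = images.get(abs(a), (abs(a),))
        out.extend(img if a > 0 else inverse(img))
    return reduce_word(out)


def act(indices, n):
    """Images of g_1..g_n after performing T_j for each j in `indices`, in order."""
    words = [(k,) for k in range(1, n + 1)]
    for j in indices:
        words = [apply_elementary(j, w) for w in words]
    return tuple(words)


braid_holds = act([1, 2, 1], 4) == act([2, 1, 2], 4)   # relation (7)
far_commute = act([1, 3], 4) == act([3, 1], 4)         # relation (6)
near_commute = act([1, 2], 4) == act([2, 1], 4)        # adjacent motions
```

Under this assumed convention the first two comparisons succeed and the third fails, which is the expected behaviour: adjacent elementary motions satisfy (7) but do not commute.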

If we now choose $g_1$ to be any loop on O surrounding $b_1$ which does not 
intersect the arcs $\sigma_j$, and let $g_i$ be the loop into which $g_{i-1}$ is deformed 
as $y_{i-1}$ moves from $b_{i-1}$ to $b_i$ along the arc $\sigma_{i-1}$ for ($i = 2, \dots, n$), then the 
motion $T_j$ induces a transformation of the following type on the set of 
loops $[g_k]$: 

( 
(8) = Gin 

If we denote the complement of the set of points $[\alpha_i]_{i=1,2,\dots,\nu}$ relative 
to the plane of the complex variable x by $C[\alpha_i]$, then, as x describes a cyclic 
path incident to $\alpha_0$ in $C[\alpha_i]$, the points $[b_k]$ undergo a motion belonging to 
the group T, the loops $g_k$ are subjected to the corresponding transformation 
which, in turn, yields a generating relation of the group G. In particular, 
the motions $T_{s_i}$ generated as x describes the loops $s_i$ induce transformations 
which yield the relations (5) for G. In consequence, a knowledge of the 
motions $T_{s_i}$ is sufficient to determine the structure of G. 

It will be useful to examine the motion induced on the roots y as x 
describes a loop about a singular value corresponding respectively to a tangent, 
ordinary double point and cusp of f. 


(A) Suppose $x = \alpha_i$ is a simple tangent to the curve f and that $y_1 = b_1$, 
$y_2 = b_2$ are the two roots of $f(\alpha_0, y) = 0$ which tend towards coincidence as 
$x \to \alpha_i$. Then, as x describes $s_i$, the motion of the roots $y_k$ is equivalent to 
one in which $y_j$ is fixed for $j > 2$ and the motion of $y_1$ and $y_2$ is typified by 
that of the roots of $y^2 = x$ as x describes the loop $|x| = 1$. This motion has 
the form $T_1$. If we do not make the simplifying assumption to the effect that 
the two roots which approach coincidence as $x \to \alpha_i$ have consecutive indices, 
the corresponding motion will have the form $T^{-1} T_1 T$ where T is a product of 
elementary motions $T_j$. 

(B) Suppose $x = \alpha_i$ is a singular value corresponding to an ordinary 
double point of f. Then the motion of the roots induced as x describes $s_i$ is 
equivalent to $T_1^2$, or, in the general case, to $T^{-1} T_1^2 T$. Such a singular value 
$\alpha_i$ may be considered as the limit of two singular values $\alpha_i'$ and $\alpha_i''$, 
corresponding to simple tangents of f, at each of which the same pair of roots is 
permuted. 

(C) Suppose $x = \alpha_i$ is a singular value corresponding to a cusp of f. 
Then the motion of the roots induced as x describes $s_i$ is equivalent to $T_1^3$, 
or, in the general case, to $T^{-1} T_1^3 T$. In this case, the singular value $\alpha_i$ can be 
considered as the limit of three singular values corresponding to simple 
tangents which have approached coincidence and at all of which the same pair of 
roots is permuted. 




The relations arising between the generators $g_k$ of G due to singular 
values of types (A), (B) and (C) are, respectively, of the following forms: 

(a) $g_1 = g_2$    (b) $g_1 g_2 = g_2 g_1$    (c) $g_1 g_2 g_1 = g_2 g_1 g_2$. 
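The three forms can be recovered by letting $T_1$, $T_1^2$ and $T_1^3$ act on the loop $g_1$ and equating $g_1$ with its image. The sketch below does this in the free group on $g_1, g_2$; the action used for $T_1$ ($g_1 \to g_2$, $g_2 \to g_2^{-1} g_1 g_2$) is an assumed convention chosen for illustration, and the function names are hypothetical.

```python
def reduce_word(word):
    """Freely reduce a word; 1 and 2 stand for g_1, g_2, negatives for inverses."""
    out = []
    for a in word:
        if out and out[-1] == -a:
            out.pop()
        else:
            out.append(a)
    return tuple(out)


def inverse(word):
    return tuple(-a for a in reversed(word))


def t1(word):
    """Assumed action of T_1 on loops: g1 -> g2, g2 -> g2^-1 g1 g2."""
    images = {1: (2,), 2: (-2, 1, 2)}
    out = []
    for a in word:
        img = images[abs(a)]
        out.extend(img if a > 0 else inverse(img))
    return reduce_word(out)


def image_of_g1(k):
    """Image of g1 under T_1 applied k times."""
    word = (1,)
    for _ in range(k):
        word = t1(word)
    return word


tangent = image_of_g1(1)   # g2
node = image_of_g1(2)      # g2^-1 g1 g2
cusp = image_of_g1(3)      # g2^-1 g1^-1 g2 g1 g2
```

Setting $g_1$ equal to each image gives $g_1 = g_2$; $g_1 = g_2^{-1} g_1 g_2$, that is $g_2 g_1 = g_1 g_2$; and $g_1 = g_2^{-1} g_1^{-1} g_2 g_1 g_2$, which after left multiplication by $g_1 g_2$ becomes $g_1 g_2 g_1 = g_2 g_1 g_2$.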


Let us now consider a variable curve f and let $\bar{f}$ denote a limit curve under 
continuous variation of f. If f and $\bar{f}$ have the same singularities, they are 
isotopic and, accordingly, possess the same fundamental group. Suppose, 
however, that as f tends towards $\bar{f}$ it acquires new multiple points or multiple 
points of higher order. For the sake of simplicity and definiteness, let us 
suppose that as $f \to \bar{f}$ the simple singular values $\alpha_1$ and $\alpha_2$ of f approach 
coincidence in $\bar{\alpha}$ so that $\bar{f}$ acquires a new ordinary double point. Any 
generating relation for f corresponding to a singular value distinct from $\alpha_1$, $\alpha_2$ 
remains a true relation for $\bar{f}$. The relations for f relative to $\alpha_1$ and $\alpha_2$, viz. 
$g_1 = g_2$, are destroyed for $\bar{f}$ since, in the limit when $\alpha_1, \alpha_2 \to \bar{\alpha}$, both the loops 
$s_1$ and $s_2$ pass through $\bar{\alpha}$. On the other hand, the relation $g_1 g_2 = g_2 g_1$ for f 
which arises from a circuit, $s$, of x about both $\alpha_1$ and $\alpha_2$ is not destroyed 
for $\bar{f}$ since, in the limit, it becomes the relation corresponding to a circuit of x 
about the singular value $\bar{\alpha}$ for $\bar{f}$. This reasoning is perfectly general and 
applies to any new multiple point, or point of higher order, acquired by $\bar{f}$. 
In fact, the following theorem has been established: 

All generating relations of the fundamental group $\bar{G}$ of a limit curve $\bar{f}$ 
of a variable curve f are also true generating relations for the fundamental 
group G of f. This theorem holds even if $\bar{f}$ is degenerate provided that it 
possesses no multiple components. 


3. Consideration of the motions, T, occurring at the singular values 
of a degenerate case of $f_{6m}(x, y)$. We wish to investigate the fundamental 
group of curves of the type 

(3)    $f_{6m}(x, y) \equiv [\phi_{3m}(x, y)]^2 + [\psi_{2m}(x, y)]^3 = 0$ 

where $\phi_{3m}$ and $\psi_{2m}$ are polynomials in the two complex variables x and y of 
respective degrees 3m and 2m. It is supposed that the intersections of $\phi_{3m} = 0$ 
and $\psi_{2m} = 0$ are distinct. If the curves $\phi_{3m} = 0$ and $\psi_{2m} = 0$ are general, 
the curve $f_{6m} = 0$ will possess $6m^2$ cusps at the intersections of the curves 
$\phi_{3m} = 0$ and $\psi_{2m} = 0$ and no further singular points. For a general choice 
of coordinate axes, $f_{6m}$ will possess, under these restrictions, $6m^2$ distinct 
critical values corresponding to cusps and $6m(3m - 1)$ distinct critical values 
corresponding to tangents. 
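The two counts agree with the classical Plücker class formula, class $= n(n-1) - 2\delta - 3\kappa$, since the simple tangents in the pencil {x = const.} are the tangents through the point at infinity on the y-axis. The short sketch below is only an arithmetic check; the function names are illustrative and do not come from the text.

```python
def cusp_count(m):
    """Intersections of curves of orders 3m and 2m (Bezout): the cusps of f_6m."""
    return (3 * m) * (2 * m)


def pluecker_class(n, nodes, cusps):
    """Class of a plane curve of order n with the given nodes and cusps."""
    return n * (n - 1) - 2 * nodes - 3 * cusps


def tangent_count(m):
    """Tangents of f_6m through a general point: order 6m, no nodes, 6m^2 cusps."""
    return pluecker_class(6 * m, 0, cusp_count(m))


counts = [(m, cusp_count(m), tangent_count(m)) for m in range(1, 5)]
```

For m = 1 this gives 6 cusps and class 12, the familiar values for the six-cuspidal sextic.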



We postpone the consideration of (3) temporarily and consider a 
degenerate case of a curve of this type, namely, 

(9)    $f'_{6m}(x, y) \equiv y^{6m} - x^{3m} = 0$. 

The critical values for this function are $x = 0$ and $x = \infty$. The singularity 
corresponding to each of these values consists of 3m branches having simple 
contact and vertical branch tangent. 

[Fig. 3. The x-plane and the y-plane.] 

We now proceed to ascertain the motion 
T' induced as x describes a loop about the critical value x = 0. Let us choose 
$x = \alpha_0$ to be x = 1 and as $s_1$ choose $|x| = 1$. Now, the roots of $f'_{6m}(1, y) = 0$, 
which we denote by $b_k$, are the 6m-th roots of unity. Let us select loops $g_k$ in 
the y-plane as indicated in Fig. 3. Then as x describes the loop $s_1$, a motion 
T' is set up which sends $b_i$ into the diametrically opposite point $b_{3m+i}$ along 
the arc $b_i b_{3m+i}$. Let $T_i$ denote the elementary motion corresponding to the 
arc $\sigma_i$ of Fig. 3. Then the motion T' may be expressed in terms of the 
elementary motions $T_i$ as follows: 




(10)    $T' = (T_{6m-1} T_{6m-2} \cdots T_1)^{3m}$. 

Since the loop $s_1$ may also be considered as a loop about $x = \infty$, the 
motion of the roots induced as x describes a loop about the latter critical point 
is again equivalent to T'. 
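The half-turn of the roots can be observed numerically. The sketch below assumes that the degenerate curve is $y^{6m} = x^{3m}$, as the stated critical values and the 6m-th roots of unity suggest; it follows each root by nearest-neighbour continuation as x runs once around $|x| = 1$ and reports where each root ends. The function name and step count are illustrative choices.

```python
import cmath
import math


def final_positions(m, steps=2000):
    """Track the roots of y**(6m) = x**(3m) as x goes once around |x| = 1.

    Returns a list p where the root starting at the k-th (6m)-th root of
    unity ends at the p[k]-th one.
    """
    n = 6 * m
    start = [cmath.exp(2j * math.pi * k / n) for k in range(n)]
    roots = list(start)
    for s in range(1, steps + 1):
        x = cmath.exp(2j * math.pi * s / steps)
        target = x ** (3 * m)
        # One n-th root of the target, then all of them.
        base = cmath.exp(1j * cmath.phase(target) / n)
        candidates = [base * cmath.exp(2j * math.pi * k / n) for k in range(n)]
        # Continue each root to the nearest candidate.
        roots = [min(candidates, key=lambda c: abs(c - r)) for r in roots]
    return [min(range(n), key=lambda k: abs(start[k] - r)) for r in roots]


shift_m1 = final_positions(1)
shift_m2 = final_positions(2)
```

Each root is carried to the diametrically opposite one, an index shift of 3m, in agreement with the description of T' in the text.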

4. Factorization of the motion $T'^2$ into elementary motions belonging 
to virtual cusps and simple tangents. The curve (9) is a member of the 
continuous subsystem of the system $\{f_{6m}\}$ consisting of the curves 

(11)    $f^*_{6m} \equiv y^{6m} - (x - a)^{3m} (x - b)^{3m} = 0$ 

and is isotopic to the general curve of this subsystem. In fact, the equations 

(12)    $x' = \dfrac{x - \tau b}{\tau x + 1 - \tau(1 + a)}$,    $y' = \dfrac{y}{\tau x + 1 - \tau(1 + a)}$, 

where $\tau$ is a parameter, define for each value of $\tau$ such that 

$\tau^2 b - \tau(1 + a) + 1 \neq 0$, 

a non-degenerate collineation $\pi_\tau$ in the projective (x, y) plane which carries 
the curve (9) into a curve of the system $\{f^*_{6m}\}$. Moreover, $\pi_0$ is the identity 
while $\pi_1(f^*_{6m})$ is the curve (9). 
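The collineation can be tested on a sample point. The sketch below assumes the reading $x' = (x - \tau b)/(\tau x + 1 - \tau(1 + a))$, $y' = y/(\tau x + 1 - \tau(1 + a))$ of the partly illegible equations, and the forms $y^{6m} = (x - a)^{3m}(x - b)^{3m}$ for (11) and $y^{6m} = x^{3m}$ for (9). The numerical values of a, b and x are arbitrary, and all names are hypothetical.

```python
def collineation(tau, a, b, x, y):
    """Assumed form of the collineation pi_tau acting on an affine point."""
    d = tau * x + 1 - tau * (1 + a)
    return (x - tau * b) / d, y / d


def on_curve_9(m, x, y, tol=1e-9):
    """Does (x, y) satisfy y**(6m) = x**(3m) up to rounding?"""
    return abs(y ** (6 * m) - x ** (3 * m)) < tol


def point_on_curve_11(a, b, x):
    """A point of y**(6m) = ((x-a)(x-b))**(3m): take y**2 = (x-a)(x-b)."""
    return x, ((x - a) * (x - b)) ** 0.5


m, a, b = 1, 0.5, -1.25
x, y = point_on_curve_11(a, b, 2 + 1j)
x1, y1 = collineation(1, a, b, x, y)
x0, y0 = collineation(0, a, b, x, y)

image_on_9 = on_curve_9(m, x1, y1)                       # pi_1 maps (11) onto (9)
identity_at_0 = abs(x0 - x) < 1e-12 and abs(y0 - y) < 1e-12
original_on_9 = on_curve_9(m, x, y)                      # the sample point itself is not on (9)
```

With these assumptions the image under $\pi_1$ of a point of (11) satisfies (9), and $\pi_0$ leaves the point fixed.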


The curve $f^*_{6m}$ is a limit curve of the general curve, $f_{6m}$, and we have, 
in the pencil {x = const.}, only two singular lines for $f^*_{6m}$, x = a and x = b. 
These singular lines each absorb a certain number of singular lines with respect 
to the general curve, $f_{6m}$, and since the singular lines x = a and x = b can be 
interchanged by a continuous variation of the curve $f^*_{6m}$ in the system (11), 
it follows that each of these lines absorbs $3m^2$ lines x = const. passing through 
cusps of $f_{6m}$ and $3m(3m - 1)$ simple tangents of $f_{6m}$. It must therefore be 
possible to factor the motion $T'^2$ into a product of motions $T_j$, $6m^2$ of which 
correspond to cusps and the remaining $6m(3m - 1)$ of which correspond to 
simple tangents. We proceed to exhibit formally one such factorization of $T'^2$, 
making use of the relations (6) and (7), and we shall show afterwards that 
this factorization actually belongs to the curve $f^*_{6m}$ considered as a limiting 
case of the general curve of the system $\{f_{6m}\}$. 
We first write $T'^2$ in the form 


It will be useful to establish two lemmas. 




Lemma 1. 7575-7 5-1) = (TyrTj-2) jT jsTj-2. 
Proof. Making use of an alternate form of (7) we have 
and in virtue of (6) 
Repeating the first process, we obtain 


which is the desired result. 
LEMMA 2. (75 T 5-1)? = 
Proof. Using (7), we may write 


and making use of an alternate form of (7) we are enabled to write the last 
member in the desired form. 
Let us now write $T'^2$ in the following manner: 


{[Tem-1 T;| [ (T em-1 em-2) (T em-s)7* 
ar (Tem-1T em-z)? (T em-s)? 
(T em-1T em-2) (T em-s) 


This arrangement is possible due to the fact that the center bracket reduces to 
the identity in virtue of the relation, (6). If we again make use of (6), this 


expression may be rearranged as follows: 


{[Tom-1T om-2T om-s om-sT (Tem-4T om-s)* 
| | om-2) *T om-1 em-2( T ¢m-sT om-s) om-s 


If we perform the obvious cancellations and apply Lemma 1 to the last bracket, 
we obtain: 


{[Toem-1T om-2T em-s(T em-1T em-2)~ T5147 3(TsT4)*| [ (Tem-1Tem-2)? 


and, applying Lemma 2 to the elements of the middle bracket, we are thus 
enabled to write $T'^2$ in the form 




(14) T? = {| T em-11 em-21 em-2) 3(T'51 
[ om-1 ) ( Tem-s T em-4 ) T,* | 


which is the desired type of factorization. 


5. A function $f^*_{6m}$, of type (3), whose critical values correspond to 
induced motions given by the factors of (14). Consider the curve 

The critical values of the function $F_{6m}(x, y)$ are divided into two classes 
according to the following classification: 

1. $x = a_j = \omega^j_{3m}$ ($j = 0, 1, \dots, 3m-1$), where $\omega^j_{3m}$ denote the 3m-th 
roots of unity. Each of these critical values corresponds to a line 
containing 2m cusps of the curve $F_{6m} = 0$, each having a vertical cusp 
tangent. 
Q [1/6 (—1)* +8 
2. f= =| |1/6m . | ix ( ) | 
12m 
for $k = 0, 1$ and $l = 0, 1, \dots, 3m-1$. Each of these values 
corresponds to a flex tangent having contact of order $2m - 1$ with the 
curve $F_{6m} = 0$. 

Let us choose x = 0 as common origin of loops $s_j$, $s_{k,l}$ in the x-plane selected 
as indicated in Fig. 4. The roots $y_k = b_k$ of $F_{6m}(0, y) = 0$ are given by: 


. (64— 
[ ix ) 


6m 
— | 9 [1/2m. 
= | 2 | 
woe 
6m 
for $j = 0, 1, \dots, 2m-1$. We now choose oriented arcs $\sigma_k$ as indicated in 
the diagram of the y-plane of Fig. 4 and choose loops $g_k$, surrounding $b_k$, 
in the manner outlined in Sec. 4. Let us examine the motion induced on the 
roots $y_k$ as x describes one of the loops $s_j$ in the x-plane. This examination 
will be somewhat simplified if we also consider the auxiliary curve 

(16)    $Y^3 = X^2$ 

obtained from the curve $F_{6m}(x, y) = 0$ by setting $Y = y^{2m} - 1$ and $X = 1 - x^{3m}$. 
The critical values of the function $F_{6m}(x, y)$ are evidently divided into three 
classes: the values $a_j$ corresponding to the cusp of (16), the values $c_{0,l}$ 
corresponding to the ordinary value $X = i$ and the values $c_{1,l}$ corresponding to the 



ordinary value $X = -i$. The initial value, x = 0, of the loops $s$ corresponds 
to the ordinary value X = 1. It is therefore clear that as x describes a loop $s_j$, 
X describes a loop A, and, as x describes a loop $s_{k,l}$, X describes a loop $A_k$, 
where A, $A_0$, $A_1$ are loops in the X-plane emanating from X = 1 and 
surrounding respectively the values $X = 0, i, -i$. If we denote the roots $Y_n$ of 
(16) for X = 1 by $B_n$ where $B_n = e[(2\pi i/3)(n - 2)]$ ($n = 1, 2, 3$), then, 
clearly, the image of the points $b_{n+3j}$ in the Y-plane is the point $B_n$ for all 
values of j. Let us define elementary motions $T_k$ of the points $b_k$ with reference 
to the oriented arcs $\sigma_k$ of Fig. 4, and elementary motions $T^*_i$ of the points $B_n$ 
with reference to arcs $\sigma^*_i$ chosen as positively oriented arcs of the unit circle 
joining $B_i$ to $B_{i+1}$ for $i = 1, 2$. As x describes the loop $s_j$, X describes the 
loop A and, therefore, the points $B_n$ undergo the motion $(T^*_2 T^*_1)^2$. It 
therefore follows that the image points, $b_{n+3j}$, in the y-plane are subjected to the 
motion $(T_{2+3j} T_{1+3j})^2$ for all values of j. Thus, as x describes the loop $s_j$, 
we have that the corresponding motion of the points $b_k$ is given by 

(17)    $(T_2 T_1)^2 (T_5 T_4)^2 \cdots (T_{6m-1} T_{6m-2})^2$ 

for $s = 0, 1, \dots, 2m-1$. 

[Fig. 4x. The x-plane.] 




Suppose x describes the loop $s_{k,l}$; then X describes the loop $A_k$. Since 
$A_k$ does not surround a critical value of the function (16), it follows that the 
motion $T^*$ of the points $B_n$ is the identity. Consequently, the motion of the 
points $b_k$ must leave the set $[b_{n+3j}]_{(j)}$ invariant for each n. This motion is 
due to the flex line $x = c_{k,l}$ of the pencil {x = const.} and interchanges the 
points $b_{(3-2k)+3j}$ in a cyclical manner. This motion has the following 
description in terms of the elementary motions $T_j$: 

where $\prod$ indicates multiplication on the right. 


For the curve $F_{6m}(x, y) = 0$, certain of the singular lines of the pencil 
{x = const.} are coincident. These coincidences are not intrinsic and are due 
to the special choice of the coordinate axes. In order to eliminate these 
coincidences from the considerations, we proceed in the following manner. 

If we denote by $F_{6m}(\theta)$ the function obtained by rotating $F_{6m}$ through a 
positive angle $\theta$, then, for small values of $\theta$, the curve $F_{6m}(\theta) = 0$ will have 
cusps near to the cusps of $F_{6m} = 0$; however, the cusp tangents will no longer 
be vertical. In fact, the critical value $x = a_j$ of $F_{6m}$ has associated with it 
critical values $x = a_{jk}$ corresponding to cusps of $F_{6m}(\theta) = 0$ and values 
$x = d_{jk}$ corresponding to simple tangents of the curve $F_{6m}(\theta) = 0$, where $a_{jk}$ 
and $d_{jk}$ are in the vicinity of $a_j$, for $k = 1, 2, \dots, 2m$. 

[Fig. 4y. The y-plane.] 

When a singular line of the pencil {x = const.}, which passes through a 
cusp having this singular line as cusp tangent, breaks up into a pair of singular 
lines, one of which is a simple tangent and the other a line through the cusp, 
the motion $(T_{s+1} T_s)^2$ breaks up into either the product of $T^3_{s+1}$ and $T^{-1}_{s+1} T_s T_{s+1}$ 
or of $T^3_s$ and $T^{-1}_{s+1} T_s T_{s+1}$, according as which of the two, essentially distinct, 
possible choices of loops in the x-plane is selected. If we are interested merely 
in the generating relations of the fundamental group and not in the actual 
motion of the roots, it is indifferent which of these choices is made, due to the 
fact that the resulting relations among the generators of the fundamental 
group are the same in both cases. 

The singular lines of the pencil {x = const.} passing through flexes of 
$F_{6m} = 0$ break up into distinct singular lines passing through points of simple 
tangency of $F_{6m}(\theta) = 0$. Once more, there is ambiguity concerning the 
determination of the actual motions corresponding to these singular lines, the 
motions again being dependent on the choice of loops. However, the flex as a 
unit imposes the same relations on the generators $g_k$ as does the corresponding 
number of distinct simple tangents which approach coincidence to form the 
flex. Consequently, for any choice of loops, the motions corresponding to these 
simple tangents must impose the same relations on the elements $g_k$ as the 
factors of the flex motion (18) which correspond to simple tangents. Thus, 
from the standpoint of relations on the generators $g_k$, it is sufficient to treat 
the set of singular lines which approach coincidence in a flex tangent as a unit 
and merely say that the imposed relations are those of the original flex, the 
individual motions corresponding to the tangents being left out of the 
consideration entirely. 

To summarize, $F_{6m}(\theta)$ will have the following properties: 

1°. Possesses $6m^2$ critical values $a_{jk}$ corresponding to cusps of the curve 
$F_{6m}(\theta) = 0$ such that, for a proper choice of loops in the x-plane surrounding 
these values, the induced motions are $(T_{3k+1})^3$ where $k = 0, 1, \dots, 2m-1$ 
and $j = 0, 1, \dots, 3m-1$. 

2°. Possesses $6m^2$ critical values $d_{jk}$ corresponding to simple tangents 
of the curve $F_{6m}(\theta) = 0$, such that, for a proper choice of loops in the x-plane, 
the induced motions are $T^{-1}_{3k+2} T_{3k+1} T_{3k+2}$ for $k = 0, 1, \dots, 2m-1$ and 
$j = 0, 1, \dots, 3m-1$. 

3°. Possesses 6m sets of $2m - 1$ critical values $c_{klt}$ corresponding to 
simple tangents of the curve $F_{6m}(\theta) = 0$ of such a type that, for a proper 
choice of loops in the x-plane, the induced motions relative to a set $c_{klt}$ for 
k and l fixed impose the same relations on the generators $g_k$ as do the motions 
(19 ) T oms2k- gmsok-3t-4l oms2k-3t-s ( T oms2k-3t-31 ) 

for $t = 0, 1, \dots, 2m-2$, where k and l may assume any of the values 
$k = 0, 1$; $l = 0, 1, \dots, 3m-1$. 

These properties are exactly those desired for the function $f^*_{6m}$ and, 
accordingly, we define $f^*_{6m}$ to be the function $F_{6m}(\theta)$. 

6. Determination of the structure of the fundamental group G* of 
the curve $f^*_{6m}(x, y) = 0$. The component transformations of $t'^2$ associated 
with the motions corresponding to the critical values of the function $f^*_{6m}(x, y)$ 
give rise to relations among the generators $g_k$ of G* of the following types: 

1°. From the transformations associated with the motions (19), we 
obtain the relations 


and the relations 

(21. j) Jom-3i = 
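where $j = 1, 2, \dots, 2m-1$. 

2°. The transformations $t^3_{6m-3j+1}$ give the relations 
(22.j)    $g_{6m-3j+1}\, g_{6m-3j+2}\, g_{6m-3j+1} = g_{6m-3j+2}\, g_{6m-3j+1}\, g_{6m-3j+2}$ 
for $j = 1, 2, \dots, 2m$. 
3°. The transformations $t^{-1}_{6m-3j+2}\, t_{6m-3j+1}\, t_{6m-3j+2}$ yield the relations 
(23.j)    $g_{6m-3j+1} = g_{6m-3j+3}$  for $j = 1, 2, \dots, 2m$. 
In addition, the generators $g_k$ also satisfy the trivial relation 
(4)    $g_1 g_2 \cdots g_n = 1$. 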
If we now apply (23.j+1) to (20.j) we have 

$g_{6m-3j-2}\, g_{6m-3j-1}\, g_{6m-3j-2} = g_{6m-3j-1}\, g_{6m-3j-2}\, g_{6m-3j+1}$ 

and, on application of (22.j+1) to the left-hand member of this expression, 
we obtain 

$g_{6m-3j-1}\, g_{6m-3j-2}\, g_{6m-3j-1} = g_{6m-3j-1}\, g_{6m-3j-2}\, g_{6m-3j+1}$ 

whence 
(24)    $g_{6m-3j-1} = g_{6m-3j+1}$  for $j = 1, \dots, 2m-1$. 
In the same way, if we apply (23.j) to (21.j) we obtain 


== +29 6m-3j+1 




and, by application of (22.j) to the right-hand member, this expression 
reduces to 

em-3j+2 == em-3j+2 
whence 
(25)    $g_{6m-3j} = g_{6m-3j+2}$  for $j = 1, \dots, 2m-1$. 


On combining the relations (23), (24) and (25) we obtain 

(26)    $g_k = g_{k+2}$  ($k = 1, 2, \dots, 6m-2$). 

If we make use of the relations (26), we are enabled to eliminate the 
generators $g_3, \dots, g_{6m}$ from the relation (4) obtaining 
(27)    $(g_1 g_2)^{3m} = 1$. 


Thus, G* may be generated by two elements $g_1$ and $g_2$ satisfying the relations 
(22.2m) and (27). The relation (22.2m) gives as a consequence the relation 
$(g_1 g_2)^3 = (g_1 g_2 g_1)^2$ and conversely. 
Let us define new elements U and V in the following manner: 

(28)    $U = g_1 g_2 g_1$;  $V = g_1 g_2$. 
Then the relations: 

(29)    $U^2 V^{-3} = U^{2m} = 1$, 

together with the defining relations (28), give as consequences the relations 
(27) and (22.2m). Hence, it is possible to generate the fundamental group, 
G*, of the curve $f^*_{6m}(x, y) = 0$ by the two elements U and V satisfying the 
relations (29). 


7. Conclusion. The curve $f^*_{6m} = 0$ is of the same type as $f_{6m} = 0$. 
Moreover, the number and kind of singularities are the same for both curves 
and the restrictive hypothesis to the effect that the cusps of $f_{6m}(x, y) = 0$ 
should be distinct is also satisfied for $f^*_{6m}(x, y) = 0$. Consequently, it is clear 
that the curves are isotopic and, therefore, that their fundamental groups 
possess the same structure. Hence, we may conclude that the fundamental 
group G of the curves 

(3)    $f_{6m}(x, y) \equiv [\phi_{3m}(x, y)]^2 + [\psi_{2m}(x, y)]^3 = 0$ 

may be generated by two elements U and V, of respective orders 2m and 3m, 
satisfying the relations $U^2 V^{-3} = U^{2m} = 1$. 
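As a sanity check on the shape of this presentation, one can look at a concrete homomorphic image. The sketch below uses two integer matrices of determinant 1, chosen for illustration only and not taken from the text, as images of $g_1$ and $g_2$. They satisfy the cusp relation $g_1 g_2 g_1 = g_2 g_1 g_2$, and $U = g_1 g_2 g_1$, $V = g_1 g_2$ then satisfy $U^2 = V^3$ with U of order 4 and V of order 6, which matches the case m = 2. This exhibits one quotient of the group, not the group itself.

```python
def mul(p, q):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


IDENTITY = ((1, 0), (0, 1))


def power(p, k):
    """k-th power of a 2x2 matrix, k >= 0."""
    out = IDENTITY
    for _ in range(k):
        out = mul(out, p)
    return out


def order(p, bound=100):
    """Smallest k >= 1 with p**k equal to the identity, or None if above bound."""
    q = p
    for k in range(1, bound + 1):
        if q == IDENTITY:
            return k
        q = mul(q, p)
    return None


# Illustrative images of the two generators (an assumption of this sketch).
g1 = ((1, 1), (0, 1))
g2 = ((1, 0), (-1, 1))

U = mul(mul(g1, g2), g1)
V = mul(g1, g2)

cusp_relation = mul(mul(g1, g2), g1) == mul(mul(g2, g1), g2)
u2_equals_v3 = power(U, 2) == power(V, 3)
orders = (order(U), order(V))
```

In this image $(g_1 g_2)^{3m}$ is the identity exactly when m is even, so the matrices model the relations only for even m; for m = 2 the orders 4 and 6 agree with the stated orders 2m and 3m.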


THE JOHNS HOPKINS UNIVERSITY, 
BALTIMORE, MARYLAND. 




GEOMETRY OF TURBINES, FLAT FIELDS, AND DIFFERENTIAL 
EQUATIONS.* 


By EDWARD KASNER and JOHN DE CICCO. 


In this paper, we study the geometry of the oriented lineal elements of a 
plane. We give additional results to those found in a paper by the senior 
writer entitled “The group of turns and slides and the geometry of turbines,” 
published in 1911 in the American Journal of Mathematics, vol. 33, pp. 193- 
202. The present paper, however, can be read independently of the earlier 
paper. 

We define $\infty^1$ elements to be a series of elements; this includes a union 
or curve as a special case. Of course, $\infty^2$ elements form a field of elements, 
which corresponds to a differential equation of first order $F(x, y, y') = 0$. 
A turbine is the series, which is obtained by converting each element of an 
oriented circle into one having the same point and a direction making a fixed 
angle $\alpha$ with the original direction. A flat field is the field that is obtained 
from the totality of all elements which are determined by either the set of all 
oriented circles containing a given element, or as a special case, the set of all 
oriented lines, which are parallel to and possess the same orientation as a given 
line. We desire to study the relationships between general series, general fields 
(differential equations), turbines and flat fields. 

For the analytic representation of an element, it will be found convenient 
to use two systems of coordinates, called the cartesian and hessian coordinate 
systems respectively. The cartesian coordinates of an element E are either 
(x, y, y') or else (x, y, $\theta$), where (x, y) are the cartesian coordinates of the 
point of the element and $\theta$ is the inclination of the line of the element. The 
hessian coordinates of an element are (u, v, w) where v is the perpendicular 
from the origin to the line of the element E, u is the angle between the 
perpendicular and the initial line, and w is the distance between the foot of 
the perpendicular and the point of the element. 

The final theorems constitute wide extensions of Scheffers’¹ theory of 
isogonal trajectories and equi-tangential trajectories, including his two dual 
theorems as very special cases. 

The main theorems in our paper are those numbered 8, 14, 15, 16, 19, 


* Received November 13, 1936; Revised December 18, 1936. 
¹ Mathematische Annalen, vol. 60 (1905), pp. 491-531. 


20, 23, 30, 33, 35, 36. The theory of conjugate differential equations 
(possessing the same $\infty^2$ osculating circles, Theorem 19) is noteworthy. 


The turbine. A turbine is the set of oriented elements which are obtained 
by converting each oriented element of an oriented circle into one having the 
same point and a direction making a fixed angle $\alpha$ with the original direction. 
We call the turbine non-linear or linear according as the circle is non-linear 
or linear. 

In cartesian coordinates, the equations of a non-linear turbine are 

$x = a + R \sin(\theta + \alpha)$,    $y = b - R \cos(\theta + \alpha)$; 

and in hessian coordinates, the equations are 

$v = a \cos u + b \sin u + r$,    $w = -a \sin u + b \cos u + s$, 

where 

$r = R \cos \alpha$,    $s = R \sin \alpha$. 

These equations show that the points of the elements of a turbine form 
a circle, which we call the outer circle of the turbine; and that the lines of the 
elements are the tangent lines of a circle, which we call the inner circle. 
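The agreement between the cartesian and hessian forms can be checked numerically. The sketch below assumes the cartesian equations $x = a + R \sin(\theta + \alpha)$, $y = b - R \cos(\theta + \alpha)$ and one particular sign convention for the hessian coordinates, namely $u = \theta - \pi/2$, $v = x \cos u + y \sin u$, $w = -x \sin u + y \cos u$. This convention is an assumption made here to match the stated formulas; the text does not spell it out. Function names and sample values are illustrative.

```python
import math


def turbine_element(a, b, R, alpha, theta):
    """Cartesian coordinates (x, y, theta) of the turbine element with inclination theta."""
    x = a + R * math.sin(theta + alpha)
    y = b - R * math.cos(theta + alpha)
    return x, y, theta


def to_hessian(x, y, theta):
    """Hessian coordinates (u, v, w) under the assumed convention u = theta - pi/2."""
    u = theta - math.pi / 2
    v = x * math.cos(u) + y * math.sin(u)      # distance from the origin to the line
    w = -x * math.sin(u) + y * math.cos(u)     # foot of the perpendicular to the point
    return u, v, w


def hessian_residuals(a, b, R, alpha, theta):
    """Differences between computed (v, w) and the hessian turbine equations."""
    u, v, w = to_hessian(*turbine_element(a, b, R, alpha, theta))
    r, s = R * math.cos(alpha), R * math.sin(alpha)
    return (
        v - (a * math.cos(u) + b * math.sin(u) + r),
        w - (-a * math.sin(u) + b * math.cos(u) + s),
    )


residuals = [hessian_residuals(1.5, -0.7, 2.0, 0.4, 0.3 * k) for k in range(20)]
```

The residuals vanish up to rounding. In this reading every point lies at distance R from (a, b), the outer circle, and every line lies at distance $|R \cos \alpha|$ from (a, b), the inner circle.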


In cartesian coordinates, the equations of a linear turbine are 
xcosu-+ ysinu =v, 4+ 


where uw and v are constants and, in hessian codrdinates, the equations of a 


linear turbine are 
U =u—a- Vcosa+ Wsina—v. 
We obtain the following two theorems: 
THEOREM 1. Two elements determine a unique turbine. 


THEOREM 2. Two turbines possess either no common elements or one 


common element. 


If a number of elements are all on a turbine, we shall say that these ele- 
ments are co-turbinal. 

If a number of turbines all have one element in common, we shall say 
that these turbines are co-elemental. 

Let $T$ be the turbine, which is obtained by converting each element of the
oriented circle $C$ into one having the same point and a direction making a fixed
angle $\alpha$ with the original direction. Then the turbine $NT$ is defined to be the
conjugate turbine of $T$, if it is obtained by converting each element of the
oriented circle $C$ into one having the same point and a direction making a fixed
angle $-\alpha$ with the original direction. (By means of a certain representation
$R$ of the elements of the plane by the points of space, studied in the paper of
1911, conjugacy is defined as polarity with respect to a basic null-system $N$).

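In hessian coördinates, the conjugate turbine $NT$ of the non-linear
turbine $T$

$$v = a\cos u + b\sin u + r, \qquad w = -a\sin u + b\cos u + s,$$

is

$$v = a\cos u + b\sin u + r, \qquad w = -a\sin u + b\cos u - s.$$

The conjugate turbine $NT$ of the linear turbine $T$

$$U = u - \alpha, \qquad V\cos\alpha + W\sin\alpha = v,$$

is

$$U = u + \alpha, \qquad V\cos\alpha - W\sin\alpha = v.$$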

THEOREM 3. The conjugate turbines of two given turbines possess no 
common elements or one common element according as the two given turbines 
possess no common elements or one common element. 

The flat field. The totality of elements determined by the set of all 
oriented circles, which contain a given element, is called a non-linear flat field. 
The given element is called the center or the central element of the flat field. 
All the elements of the field are thus co-circular with a fixed (central) element. 

In cartesian coördinates, the non-linear flat field is given by

$$\theta = 2\arctan\frac{y - \beta}{x - \alpha} - \gamma + \pi,$$

where $(\alpha, \beta, \gamma)$ are the cartesian coördinates of the element which is the
negative (or reverse) of the given element contained in the oriented circles.

In hessian coördinates the non-linear flat field is given by

$$w = (v + b)\tan\tfrac{1}{2}(a - u) + c,$$

where $(a, b, c)$ are the hessian coördinates of the element, which is the negative
of the given element.
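
The hessian equation of the flat field can be tested against the definition (elements co-circular with the central element). The sketch below makes several assumptions that are not stated in the text: the flat field equation is read as $w = (v + b)\tan\frac{1}{2}(a - u) + c$, the negative of an element $(u, v, w)$ is taken to be $(u + \pi, -v, -w)$, and an oriented circle is treated as a turbine with $s = 0$. The helper names are illustrative only.

```python
import math

def flat_field_w(u, v, a, b, c):
    """Value of w prescribed by the flat field with parameters (a, b, c)."""
    return (v + b) * math.tan((a - u) / 2) + c

def negative_element(u, v, w):
    """Reverse the orientation of an element; assumed rule (u, v, w) -> (u + pi, -v, -w)."""
    return u + math.pi, -v, -w

def circle_through(u0, v0, w0, A):
    """Oriented circle (turbine with s = 0) through (u0, v0, w0) whose centre has abscissa A.

    Returns (A, B, r); requires cos(u0) != 0.
    """
    B = (w0 + A * math.sin(u0)) / math.cos(u0)
    r = v0 - A * math.cos(u0) - B * math.sin(u0)
    return A, B, r
```

With these conventions every element of every oriented circle through the central element satisfies the flat field equation, which is the content of the definition.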
The set of all elements obtained by setting $u$ = constant, say $\alpha$, is called
a linear flat field.

It is easy to prove the following theorems:

THEOREM 4. Three elements determine a unique flat field.

THEOREM 5. Two flat fields have in common one and only one turbine.

THEOREM 6. Three flat fields have in common one and only one element
or else they have a turbine in common.


The envelopes of a one parameter family of series of elements. We
define $\infty^1$ elements to be a series of elements. The points of the elements of
a series form a curve which we call the point-curve of the series, and the lines
of the elements of a series are the tangent lines of a curve which we call the
line-curve of the series.

Now we consider a one parameter family of series of elements. Let us 
determine the envelope of the one parameter family of point-curves and the 
envelope of the one parameter family of line-curves of the one parameter 
family of series of elements. 

Now consider any particular series of the one parameter family of series. 
The point of intersection of the envelope of the one parameter family of point- 
curves and of the point-curve of this particular series belongs to an element 
of this particular series. This element is defined to be an element of the point- 
envelope of the one parameter family of series of elements. 

Again consider any particular series of the one parameter family of series. 
The common tangent line of the envelope of the one parameter family of line- 
curves and of the line-curve of this particular series contains an element of 
this particular series. This element is defined to be an element of the line
envelope of the one parameter family of series of elements.

If the one parameter family of series is given in cartesian coördinates by
the equations

$$y = f(x, t), \qquad \theta = g(x, t),$$

where $t$ is the parameter, then the point-envelope is given by the equations

$$y = f(x, t), \qquad \theta = g(x, t), \qquad f_t(x, t) = 0.$$

If the one parameter family of series is given in hessian coördinates by the
equations

$$v = F(u, t), \qquad w = G(u, t),$$

where $t$ is the parameter, then the line envelope is given by the equations

$$v = F(u, t), \qquad w = G(u, t), \qquad F_t(u, t) = 0.$$


THEOREM 7. For the point envelope and the line envelope of a one
parameter family of series of elements to be identical, it is necessary and
sufficient that either the one parameter family of series be a one parameter
family of oriented curves; or, if the one parameter family of series is given
in cartesian coördinates by the equations

$$x = \phi(\theta, t), \qquad y = \psi(\theta, t),$$

where $t$ is the parameter, the eliminant with respect to $\theta$ of the two equations

$$\phi_t(\theta, t) = 0, \qquad \psi_t(\theta, t) = 0,$$

be identically zero, or if the one parameter family of series is given in hessian
coördinates by the equations

$$v = f(u, t), \qquad w = g(u, t),$$

where $t$ is the parameter, the eliminant with respect to $u$ of the two equations

$$f_t(u, t) = 0, \qquad g_t(u, t) = 0,$$

be identically zero.


If a family of series of elements is given in cartesian coördinates by the
equations

$$x = \phi(\theta, t), \qquad y = \psi(\theta, t),$$

and, if the eliminant with respect to $\theta$ of the two equations

$$\phi_t(\theta, t) = 0, \qquad \psi_t(\theta, t) = 0,$$

is identically zero, we define the series, which is obtained from the equations

$$x = \phi(\theta, t), \qquad y = \psi(\theta, t),$$

where $\theta = \theta(t)$ is the common solution of the two equations

$$\phi_t(\theta, t) = 0, \qquad \psi_t(\theta, t) = 0,$$

to be the envelope of the family of series of elements.

If a one parameter family of series of elements is given in hessian
coördinates by the equations

$$v = f(u, t), \qquad w = g(u, t),$$

and if the eliminant with respect to $u$ of the two equations

$$f_t(u, t) = 0, \qquad g_t(u, t) = 0,$$

is identically zero, we define the series, which is obtained from the equations

$$v = f(u, t), \qquad w = g(u, t),$$

where $u = u(t)$ is the common solution of the two equations

$$f_t(u, t) = 0, \qquad g_t(u, t) = 0,$$

to be the envelope of the one parameter family of series of elements.




It is easy to prove that the above two definitions are equivalent. 


THEOREM 8. The necessary and sufficient condition that the one
parameter family of turbines

$$v = a(t)\cos u + b(t)\sin u + r(t), \qquad w = -a(t)\sin u + b(t)\cos u + s(t)$$

possess an envelope is that

$$a'^2 + b'^2 = r'^2 + s'^2.$$

Moreover the envelope is unique and it is given by the equations

$$\cos u = \frac{-a'r' - b's'}{a'^2 + b'^2}, \qquad \sin u = \frac{a's' - b'r'}{a'^2 + b'^2},$$

$$v = a\cos u + b\sin u + r, \qquad w = -a\sin u + b\cos u + s.$$


For the eliminant of the equations

$$a'\cos u + b'\sin u + r' = 0, \qquad -a'\sin u + b'\cos u + s' = 0,$$

is obviously the above condition. The series is unique since $\cos u$, $\sin u$, $v$, $w$
satisfy linear equations. The theorem follows.
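
The elimination in this proof is easy to carry out numerically. The sketch below assumes that the condition of Theorem 8 reads $a'^2 + b'^2 = r'^2 + s'^2$, which is what solving the two linear equations for $\cos u$, $\sin u$ and imposing $\cos^2 u + \sin^2 u = 1$ gives. The sample family $a = \cos t$, $b = \sin t$, $r = t$, $s = 0$ and all function names are hypothetical choices made only for illustration.

```python
import math

def envelope_direction(da, db, dr, ds, tol=1e-9):
    """Solve a'cos u + b'sin u + r' = 0, -a'sin u + b'cos u + s' = 0 for (cos u, sin u).

    Returns None when the family has no envelope element for these derivative values.
    """
    n = da * da + db * db
    if n == 0.0 or abs(n - (dr * dr + ds * ds)) > tol:
        return None
    return -(da * dr + db * ds) / n, (da * ds - db * dr) / n

def sample_family(t):
    """Hypothetical family of turbines with its t-derivatives."""
    params = (math.cos(t), math.sin(t), t, 0.0)
    derivs = (-math.sin(t), math.cos(t), 1.0, 0.0)
    return params, derivs

def envelope_element(t):
    """Envelope element (u, v, w) of the sample family at parameter t, or None."""
    (a, b, r, s), derivs = sample_family(t)
    direction = envelope_direction(*derivs)
    if direction is None:
        return None
    cu, su = direction
    v = a * cu + b * su + r
    w = -a * su + b * cu + s
    return math.atan2(su, cu), v, w
```

For the sample family the envelope works out to the series $u = t - \pi/2$, $v = t$, $w = 1$.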
From Theorem 8 we obtain the following: 


THEOREM 9. The one parameter family of conjugate turbines does or
does not possess an envelope according as the given one parameter family of
turbines does or does not possess an envelope.

THEOREM 10. The necessary and sufficient conditions that the one
parameter family of turbines

$$v = a(t)\cos u + b(t)\sin u + r(t), \qquad w = -a(t)\sin u + b(t)\cos u + s(t),$$

be all co-elemental are

The tangent turbines of a series of elements. Any series of elements,
which has the property that consecutive elements are non-parallel, may be
given in hessian coördinates by the equations

$$v = v(u), \qquad w = w(u).$$

In what follows, we use hessian coördinates.

Let $u$ and $u + \Delta u$ determine the two elements of the series

$$(u, v, w) \quad\text{and}\quad (u + \Delta u,\; v(u + \Delta u) = v + \Delta v,\; w(u + \Delta u) = w + \Delta w).$$

Since these two elements cannot be parallel, they determine a unique non-
linear turbine, which is given by the parameter values


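
$$a = \frac{1}{2\sin\tfrac{1}{2}\Delta u}\left[-\Delta v\sin\tfrac{1}{2}(2u + \Delta u) - \Delta w\cos\tfrac{1}{2}(2u + \Delta u)\right],$$

$$b = \frac{1}{2\sin\tfrac{1}{2}\Delta u}\left[\Delta v\cos\tfrac{1}{2}(2u + \Delta u) - \Delta w\sin\tfrac{1}{2}(2u + \Delta u)\right],$$

$$r = v + \tfrac{1}{2}\Delta v + \tfrac{1}{2}\Delta w\cot\tfrac{1}{2}\Delta u,$$

$$s = w + \tfrac{1}{2}\Delta w - \tfrac{1}{2}\Delta v\cot\tfrac{1}{2}\Delta u.$$

The limiting turbine (of the above set of turbines), as $\Delta u$ approaches zero,
is given by the parameter values

$$a = -v'(u)\sin u - w'(u)\cos u,$$
$$b = v'(u)\cos u - w'(u)\sin u,$$
$$r = v(u) + w'(u),$$
$$s = -v'(u) + w(u).$$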


We call this turbine the tangent turbine of the series at the element $(u, v, w)$.²
It is found that the rate of turning with respect to the arc length of the
point-curve of the series of the element of the series is $\pm 1/R$ where $R$ is the
radius of the outer circle.
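
The limiting process that defines the tangent turbine can be reproduced numerically. The sketch below is an illustration only: it reads the chord-turbine parameter values as obtained by subtracting the turbine equations at $u$ and $u + \Delta u$, and the function names, together with any series one feeds in, are hypothetical.

```python
import math

def chord_turbine(u, v, w, du, dv, dw):
    """Parameter values (a, b, r, s) of the turbine through
    (u, v, w) and (u + du, v + dv, w + dw), with du not a multiple of 2*pi."""
    m = u + du / 2
    k = 1.0 / (2.0 * math.sin(du / 2))
    cot = math.cos(du / 2) / math.sin(du / 2)
    a = k * (-dv * math.sin(m) - dw * math.cos(m))
    b = k * (dv * math.cos(m) - dw * math.sin(m))
    r = v + dv / 2 + dw / 2 * cot
    s = w + dw / 2 - dv / 2 * cot
    return a, b, r, s

def tangent_turbine(u, v, w, v1, w1):
    """Limiting turbine as du -> 0; v1 and w1 stand for v'(u) and w'(u)."""
    a = -v1 * math.sin(u) - w1 * math.cos(u)
    b = v1 * math.cos(u) - w1 * math.sin(u)
    return a, b, v + w1, w - v1
```

For a smooth series the chord turbine approaches the tangent turbine as the increment shrinks, and both contain the element $(u, v, w)$.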


THEOREM 11. The necessary and sufficient conditions that a one parameter
family of turbines be a set of tangent turbines of a series of elements, are that
the one parameter family of turbines be not a co-elemental family of turbines
and possess an envelope. Moreover the envelope is the series to which the
turbines are the tangent turbines.

The proof of Theorem 11 follows immediately from a consideration of the
equations for the parameter values of the tangent turbines of a series.

From Theorem 7 and Theorem 11 we obtain

THEOREM 12. The necessary and sufficient conditions, that a single in-
finitude of turbines be a set of tangent turbines to a series, are (1) that the set
of turbines be not co-elemental and, (2) if the turbines are not all circles, the
line and point-envelopes of the turbines be coincident and, if the turbines are
all circles, the envelope of the circles consist of one curve.

THEOREM 13. If a one parameter family of turbines is a set of tangent
turbines, then the conjugate turbines are either a set of tangent turbines or a
set of co-elemental turbines.

² If the series is a curve (union), the tangent turbine at the element $E$ becomes
the osculating circle of the curve.


The osculating flat fields of series of elements. The flat field, which has
three consecutive elements in common with a series at an element $E$ of the
series, is called the osculating flat field of the series at $E$.

Let the flat field

$$w = (v + b)\tan\tfrac{1}{2}(a - u) + c,$$

be the osculating flat field of the series

$$v = v(u), \qquad w = w(u),$$

at the element $E$. Then if $\alpha$, $\beta$, $r$, $s$ are the parameter values of the tangent
turbine of the series at $E$, we must have

$$\cos a = \frac{\alpha' r' - \beta' s'}{\alpha'^2 + \beta'^2}, \qquad \sin a = \frac{\alpha' s' + \beta' r'}{\alpha'^2 + \beta'^2},$$

$$b = \alpha\cos a + \beta\sin a - r, \qquad c = -\alpha\sin a + \beta\cos a + s.$$

These equations show that the centers (central elements) of the osculating
flat fields of the series are the elements of the envelope of the conjugate
turbines.


General fields of elements. The two parameter family of elements, which
is given in cartesian coördinates by the equation

$$\theta = g(x, y),$$

or in hessian coördinates by the equation

$$w = f(u, v),$$

where $f$ is of period $2\pi$ in $u$, is called a field of elements.

Let the field be given in hessian coördinates by $w = f(u, v)$. Then, if
$v = v(u)$, the functions $v = v(u)$, $w = f(u, v(u))$ define a series of elements,
which we call a field series of the field.

Now let a field series contain the element $(u, v, w = f(u, v))$. The
parameter values of the tangent turbine of this field series at the element
$(u, v, w = f(u, v))$ are given by

$$a = -v'(\sin u + f_v\cos u) - f_u\cos u,$$
$$b = v'(\cos u - f_v\sin u) - f_u\sin u,$$
$$r = v + f_u + v' f_v,$$
$$s = f - v'.$$

If the field is given in cartesian coördinates by $\theta = g(x, y)$, then the
parameter values of the tangent turbine of the field series at the element
$(x, y, \theta = g(x, y))$ are

$$a = x - \frac{y'}{g_x + y' g_y}, \qquad b = y + \frac{1}{g_x + y' g_y}, \qquad R = \frac{\sqrt{1 + y'^2}}{g_x + y' g_y},$$

$$\cos(\theta + \alpha) = \frac{1}{\sqrt{1 + y'^2}}, \qquad \sin(\theta + \alpha) = \frac{y'}{\sqrt{1 + y'^2}}.$$


From these formulae, we easily obtain the following fundamental theorems
on the structure of a field in the neighborhood of any one of its elements.

THEOREM 14. Consider any field $F$ and any element $E$ of the field.
Then we study the totality of series starting at $E$ and contained in the field.
From this totality, select that subset of series whose point-loci pass through
the point of $E$ in a given direction and whose line loci thereby necessarily touch
the line of $E$ at a fixed point. This subset (although it contains $\infty$ series)
determines a unique tangent turbine.

THEOREM 15. By varying the given direction in Theorem 14 we thus
obtain $\infty^1$ turbines. These turbines have their centers on a straight line. This
line we shall call the central line relative to the given field $F$ and the given
element $E$.

From Theorem 15 we obtain

THEOREM 16. The outer circles of the $\infty^1$ turbines of Theorem 15 form
a pencil in the sense of elementary circle geometry, that is, the circles have two
points in common. One of the fixed points is, of course, the point of the
element $E$ and the other is a new point which we denote by $P$. The inner
circles of the turbines form a pencil in the sense of higher circle geometry,
that is, the circles have two lines in common. One of the fixed lines is of
course the line of the element $E$ and the other is a new line which we
denote by $l$.

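The central line is given either by the equation

$$x(\cos u - f_v\sin u) + y(\sin u + f_v\cos u) + f_u = 0,$$

or by the equation

$$-g_y(X - x) + g_x(Y - y) = 1.$$

The hessian coördinates of this straight line are determined by

$$\cos(U - u) = \frac{1}{\pm\sqrt{1 + f_v^2}}, \qquad \sin(U - u) = \frac{f_v}{\pm\sqrt{1 + f_v^2}}, \qquad V = \frac{-f_u}{\pm\sqrt{1 + f_v^2}}.$$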
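
That the centers of the tangent turbines at one element fall on the central line can be confirmed by a direct computation. The sketch below assumes the hessian parameter values $a = -v'(\sin u + f_v\cos u) - f_u\cos u$, $b = v'(\cos u - f_v\sin u) - f_u\sin u$, $r = v + f_u + v'f_v$, $s = f - v'$ for the tangent turbine of a field series; the function names are illustrative only.

```python
import math

def field_tangent_turbine(u, v, f, fu, fv, v1):
    """Tangent turbine (a, b, r, s) of the field series with slope v1 = v'(u)
    through the element (u, v, f) of a field with partial derivatives fu, fv."""
    a = -v1 * (math.sin(u) + fv * math.cos(u)) - fu * math.cos(u)
    b = v1 * (math.cos(u) - fv * math.sin(u)) - fu * math.sin(u)
    return a, b, v + fu + v1 * fv, f - v1

def central_line_residual(x, y, u, fu, fv):
    """Left member of the central line equation; zero when (x, y) is on the line."""
    return (x * (math.cos(u) - fv * math.sin(u))
            + y * (math.sin(u) + fv * math.cos(u)) + fu)
```

Eliminating $v'$ between the expressions for $a$ and $b$ is exactly what produces the linear equation, so the residual vanishes for every choice of $v'$.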



The tangent flat field. We say that the two fields $f$ and $F$ are tangent
to each other at a common element $E$, if any two field series of the fields $f$ and
$F$ respectively, which contain the element $E$ and which have the property that
either their line curves or their point curves have a common tangent element
at $E$, are such that their tangent turbines at $E$ are identical.

It is then obvious that two fields $w = f(u, v)$ and $w = F(u, v)$ are tan-
gent at a common element $E$ if

$$f_u = F_u, \qquad f_v = F_v.$$

Similarly the two fields $\theta = g(x, y)$ and $\theta = G(x, y)$ are tangent at a common
element $E$, if

$$g_x = G_x, \qquad g_y = G_y.$$

We call the flat field, which is tangent to a field $F$ at an element $E$, the
tangent flat field of the field $F$ at $E$.

Let the flat field

$$w = (v + b)\tan\tfrac{1}{2}(a - u) + c,$$

be the tangent flat field of the field

$$w = f(u, v),$$

at the element $E(u, v, w = f(u, v))$. Then we must have

$$a = u + 2\arctan f_v + 2k\pi,$$
$$b = -v - 2f_u/(1 + f_v^2),$$
$$c = f(u, v) + 2f_u f_v/(1 + f_v^2).$$
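
The parameter values of the tangent flat field follow from matching the value and the two first partial derivatives of the flat field with those of $f$. The sketch below assumes the flat field equation $w = (v + b)\tan\frac{1}{2}(a - u) + c$ and the tangency conditions $f_u = F_u$, $f_v = F_v$; the field used to try it out is a hypothetical example, and the names are illustrative.

```python
import math

def tangent_flat_field(u, v, f, fu, fv, k=0):
    """Parameters (a, b, c) of the flat field tangent to w = f(u, v) at (u, v, f)."""
    d = 1 + fv * fv
    a = u + 2 * math.atan(fv) + 2 * k * math.pi
    b = -v - 2 * fu / d
    c = f + 2 * fu * fv / d
    return a, b, c

def flat_field_w(u, v, a, b, c):
    """Value of w prescribed by the flat field with parameters (a, b, c)."""
    return (v + b) * math.tan((a - u) / 2) + c
```

A finite-difference comparison of the flat field with the given field at the element confirms agreement to first order.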
One parameter family of fields. Envelope. Characteristics. The equation

$$w = f(u, v, \alpha),$$

defines a one parameter family of fields.

The series, which is a field series of each of two consecutive fields of the
family, is called a characteristic of the field. The locus of all the characteristics
of the one parameter family of fields is a field and we call it the envelope of
the one parameter family of fields. The equations

$$w = f(u, v, \alpha), \qquad f_\alpha(u, v, \alpha) = 0,$$

for each $\alpha$ represent a characteristic of the family and, when we eliminate $\alpha$
from the above two equations, the resulting equation represents the envelope
of the family of fields.

It is easily seen that the envelope is tangent to each member of the family
of fields at all elements of its characteristics.



The locus of the elements, which are common to consecutive characteristics
of a one parameter family of fields, is called the edge of regression. The
eliminants, with respect to $\alpha$, of the equations

$$w = f(u, v, \alpha), \qquad f_\alpha(u, v, \alpha) = 0, \qquad f_{\alpha\alpha}(u, v, \alpha) = 0,$$

give the equations of the edge of regression.

It is obvious that the tangent turbines of the edge of regression and any
characteristic at a common element are identical.


Developable fields. The envelope of a one parameter family of flat fields 
is called a developable field. The characteristics of the one parameter family 
of flat fields are turbines and these turbines are called the generators of the 
developable field. 

Since each flat field is tangent to the envelope along its characteristic, 
it follows that the tangent flat field to a developable field is the same at all 
elements of a generator. The edge of regression of the developable is the 
series to which the generators are the tangent turbines. Moreover, since con- 
secutive generators are consecutive tangent turbines of the edge of regression, 
the osculating flat field of the series is that flat field of the family which con- 
tains these generators. But this flat field is tangent to the developable. Hence, 
the osculating flat field at any element of the edge of regression is the tangent 
flat field to the developable field. 


THEOREM 17. For the field $w = f(u, v)$ to be a developable field, it is
necessary and sufficient that

$$(1 + f_v^2 + 2f_{uv})^2 - 4f_{vv}(f_{uu} + f_u f_v) = 0.$$

For the tangent flat field at the element $(u, v, f(u, v))$ has parameter
values

$$a = u + 2\arctan f_v + 2k\pi, \qquad b = -v - 2f_u/(1 + f_v^2), \qquad c = f + 2f_u f_v/(1 + f_v^2).$$

The necessary and sufficient condition that $b = b(a)$, $c = c(a)$ is then
seen to be the above equation.


Conjugate fields of elements. We define the tangent turbines to any field
series of a field to be the tangent turbines of the field. In hessian coördinates
the tangent turbines of a field are given by the parameter values

$$a = -v'(\sin u + f_v\cos u) - f_u\cos u,$$
$$b = v'(\cos u - f_v\sin u) - f_u\sin u,$$
$$r = v + f_u + v' f_v,$$
$$s = f - v'.$$

The conjugate turbines of the above turbines are given by the parameter
values

$$a = -v'(\sin u + f_v\cos u) - f_u\cos u,$$
$$b = v'(\cos u - f_v\sin u) - f_u\sin u,$$
$$r = v + f_u + v' f_v,$$
$$s = -f + v'.$$

Since the tangent turbines of the field $w = f(u, v)$ at the element
$(u, v, f(u, v))$ all contain the element $(u, v, f(u, v))$, the conjugate turbines
of these turbines must also contain an element and it is unique. We call the
element $(\bar u, \bar v, \bar w)$ the conjugate of the element $(u, v, w)$. If $E$ is any element
of the field $w = f(u, v)$, then we denote the conjugate element by $\bar E$. The
element $\bar E$ is given in hessian coördinates by the equations

$$\cos(\bar u - u) = -\frac{1 - f_v^2}{1 + f_v^2}, \qquad \sin(\bar u - u) = -\frac{2f_v}{1 + f_v^2},$$

$$\bar v = v + 2f_u/(1 + f_v^2),$$

$$\bar w = -f(u, v) - 2f_u f_v/(1 + f_v^2),$$

and in cartesian coördinates, the element $\bar E$ is given by the equations

$$\bar x = x - \frac{2g_y}{g_x^2 + g_y^2}, \qquad \bar y = y + \frac{2g_x}{g_x^2 + g_y^2},$$

$$\bar\theta = -g(x, y) + 2\arctan\frac{g_y}{g_x} + (2k + 1)\pi.$$


From these equations, we obtain 


THEOREM 18. The necessary and sufficient condition, that the conjugate
elements of a field be the elements of a field, is that the given field be non-
developable.

For, it is obvious that the necessary and sufficient condition, that the set
of conjugate elements be at most a one parameter family of elements, is

$$(1 + f_v^2 + 2f_{uv})^2 - 4f_{vv}(f_{uu} + f_u f_v) = 0,$$

which means that the field $w = f(u, v)$ must be a developable field. The
theorem follows.
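
The defining property of the conjugate element (it lies on every conjugate turbine at the given element) can be checked numerically. The sketch below is an illustration under assumptions: the conjugate turbine is obtained from the tangent turbine by changing the sign of $s$, and the hessian coördinates of the conjugate element are taken as $\cos(\bar u - u) = -(1 - f_v^2)/(1 + f_v^2)$, $\sin(\bar u - u) = -2f_v/(1 + f_v^2)$, $\bar v = v + 2f_u/(1 + f_v^2)$, $\bar w = -f - 2f_uf_v/(1 + f_v^2)$. Function names and the sample field are hypothetical.

```python
import math

def conjugate_element(u, v, f, fu, fv):
    """Hessian coordinates of the conjugate of the element (u, v, f)."""
    d = 1 + fv * fv
    delta = math.atan2(-2 * fv / d, -(1 - fv * fv) / d)   # the difference u_bar - u
    return u + delta, v + 2 * fu / d, -f - 2 * fu * fv / d

def conjugate_tangent_turbine(u, v, f, fu, fv, v1):
    """Conjugate (a, b, r, -s) of the tangent turbine of the field series with slope v1."""
    a = -v1 * (math.sin(u) + fv * math.cos(u)) - fu * math.cos(u)
    b = v1 * (math.cos(u) - fv * math.sin(u)) - fu * math.sin(u)
    return a, b, v + fu + v1 * fv, -f + v1
```

Whatever value is chosen for $v'$, the conjugate turbine passes through the same conjugate element, which is why that element is unique.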

If a field is non-developable, we term the field of conjugate elements the 
conjugate field, a fundamental concept in our theory. 

From this follows 


THEOREM 19. Each tangent turbine of the conjugate field is the con-
jugate turbine of each tangent turbine of the given field. From this it follows
that two conjugate families of curves have the same osculating circles.

For a non-developable field, the equations

$$\cos(U - u) = -\frac{1 - f_v^2}{1 + f_v^2}, \qquad \sin(U - u) = -\frac{2f_v}{1 + f_v^2},$$

$$V = v + 2f_u/(1 + f_v^2),$$

define a line transformation. We call it the conjugate line transformation for
the field.

For a non-developable field, the equations

$$X = x - 2g_y/(g_x^2 + g_y^2),$$
$$Y = y + 2g_x/(g_x^2 + g_y^2),$$

define a point transformation. We call it the conjugate point transformation
for the field. The following four results are deduced:


THEOREM 20. For a line transformation to be a conjugate line trans-
formation of a field, it is necessary and sufficient that the correspondent $\bar E$ on $\bar l$
of any element $E$ on $l$ be in projective involution with the element $E'$ on $\bar l$,
which is the tangent element on $\bar l$ of the oriented circle which contains the
element $E$ and which is tangent to the line $\bar l$.

THEOREM 21. Let a line transformation be a conjugate line trans-
formation. Then it is the conjugate line transformation of a unique field
$w = \phi(u, v)$, which contains a given element $(u_0, v_0, w_0)$. Moreover, any
other field, of which it is the conjugate line transformation, is obtained by
applying a slide to the elements of the field $w = \phi(u, v)$.

THEOREM 22. For a point transformation to be the conjugate point
transformation of a field, it is necessary and sufficient that the correspondent $\bar E$
on $\bar P$ of the element $E$ on $P$ be in projective involution with the element $E'$
on $\bar P$ which is the tangent element on $\bar P$ of the circle which contains the element
$E$ and the point $\bar P$.

THEOREM 23. Let a point transformation be a conjugate point trans-
formation. Then it is the conjugate point transformation of a unique field
$\theta = \psi(x, y)$ which contains a given element $(x_0, y_0, \theta_0)$. Moreover, any other
field, of which it is the conjugate point transformation, is obtained by applying
a turn to the elements of the field $\theta = \psi(x, y)$.


The tangent turbines of a field. Let us consider the tangent turbines of
the field. The parameter values of the turbines are

$$a = -\omega(\sin u + f_v\cos u) - f_u\cos u,$$
$$b = \omega(\cos u - f_v\sin u) - f_u\sin u,$$
$$r = v + f_u + \omega f_v,$$
$$s = f - \omega,$$

where the turbine determined by $u$, $v$, $\omega$ is the tangent turbine of any field
series which contains the element $(u, v, f(u, v))$ and whose line curve contains
the tangent element $(u, v, \omega)$ at the element $(u, v, f(u, v))$.

By means of the above equations we are able to prove the following 


theorems: 


THEOREM 24. For the set of tangent turbines of a field to be a three
parameter family of turbines, it is necessary and sufficient that the field be not
a flat field.

THEOREM 25. For the set of tangent turbines of a field to be a two
parameter family of turbines, it is necessary and sufficient that the field be a
flat field.

THEOREM 26. For a two parameter family of turbines to be the tangent
turbines of a flat field, it is necessary and sufficient that the conjugate turbines
all contain a given element. Moreover, the given element is the center of the
flat field.

THEOREM 27. The necessary and sufficient condition, that every one
parameter family of turbines of the tangent turbines of a field possess an
envelope, is that the field be a flat field.

THEOREM 28. For the tangent turbines of a field to be field series of the
field, it is necessary and sufficient that the field be a flat field.


Let us now consider the three parameter family of turbines

$$v = a(\lambda, \mu, \nu)\cos u + b(\lambda, \mu, \nu)\sin u + r(\lambda, \mu, \nu),$$
$$w = -a(\lambda, \mu, \nu)\sin u + b(\lambda, \mu, \nu)\cos u + s(\lambda, \mu, \nu).$$

Since the above set of turbines is a three parameter family of turbines,
at least two of the jacobians

$$\frac{D(a, b, r)}{D(\lambda, \mu, \nu)}, \qquad \frac{D(a, b, s)}{D(\lambda, \mu, \nu)}, \qquad \frac{D(a, r, s)}{D(\lambda, \mu, \nu)}, \qquad \frac{D(b, r, s)}{D(\lambda, \mu, \nu)}$$

are not identically zero.

A three parameter family of turbines, whose inner circles are all distinct,
is called a general three parameter family of turbines. It is seen that the
necessary and sufficient condition for a three parameter family of turbines
to be a general set of turbines is that the jacobian

$$\frac{D(a, b, r)}{D(\lambda, \mu, \nu)}$$

be not identically zero. Hence, any general three parameter family of turbines
may be given by the equations

$$v = a\cos u + b\sin u + r, \qquad w = -a\sin u + b\cos u + s(a, b, r).$$


A three parameter family of turbines, such that it consists of turbines 
which contain an element of a fixed series, is called a co-serial set of turbines. 


THEOREM 29. The necessary and sufficient conditions for a three para-
meter family of turbines to be a co-serial set of turbines are that the family
be a general set of turbines and that the equations

$$s_a^2 + s_b^2 = 1 + s_r^2,$$
(1 + ( See Sov) orr = (25q Sor (-— -+- SaSr) Sars 


be identically satisfied. 


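The fixed series is uniquely determined and is given by the equations

$$\cos u = \frac{s_a s_r - s_b}{1 + s_r^2}, \qquad \sin u = \frac{s_a + s_b s_r}{1 + s_r^2},$$

$$v = a\cos u + b\sin u + r, \qquad w = -a\sin u + b\cos u + s(a, b, r).$$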


THEOREM 30. For a three parameter family of turbines to be a set of
tangent turbines of a field, it is necessary and sufficient that the family be not
a co-serial set of turbines, that the family be a general set of turbines and
finally that the equation

$$s_a^2 + s_b^2 = 1 + s_r^2$$

be identically satisfied.

The field, to which the turbines are the tangent turbines, is unique and
it is given by the equations

$$\cos u = \frac{s_a s_r - s_b}{1 + s_r^2}, \qquad \sin u = \frac{s_a + s_b s_r}{1 + s_r^2},$$

$$v = a\cos u + b\sin u + r, \qquad w = -a\sin u + b\cos u + s(a, b, r).$$


THEOREM 31. The necessary and sufficient conditions, that a three pa-
rameter family of turbines be a set of tangent turbines of a field, are (1) that
the set of turbines be not a co-serial set of turbines, (2) that the set be a general
set of turbines, and (3) that, with each turbine $T$ of the family and its con-
jugate turbine $NT$, there be associated two unique elements $E$ and $\bar E$ respectively,
where the element $E$ is on $T$ and the element $\bar E$ is on $NT$, such that every one
parameter set of enveloping turbines $Q$ of the family has the property that,
either the series, to which the turbines are the enveloping turbines, consists
of the elements $E$, each of which is on a turbine $T$ of $Q$ or that the series,
(to which the one parameter family of enveloping turbines $NQ$, each of which
is the conjugate turbine of a turbine of $Q$, is the enveloping set of turbines)
consists of the elements $\bar E$, each of which is on a turbine of $NQ$.

THEOREM 32. If the series, to which the single infinitude of enveloping
turbines $NQ$ (or $Q$) are the enveloping turbines, consists of the elements $\bar E$
(or $E$), then every element of the series, to which the single infinitude of
enveloping turbines $Q$ (or $NQ$) are the enveloping turbines, is the element
of a turbine $T$ (or $NT$) of the enveloping turbines $Q$ (or $NQ$), which is on
the line $q$, such that the tangent line of the curve of centers of the turbines at
the center of $T$ (or $NT$) is the bisector of the angle, whose sides are the
oriented line of $\bar E$ (or $E$) on $NT$ (or $T$) and the line $q$.


A characteristic property of whirl transformations. We begin by con-
sidering certain simple operations or transformations on the oriented lineal
elements of the plane. A turn $T_\alpha$ converts each element into one having the
same point and a direction making a fixed angle $\alpha$ with the original direction.
By a slide $S_k$ the line of the element remains the same and the point moves
along the line a fixed distance $k$. These transformations together generate a
continuous group of three parameters, which we call the group of whirl trans-
formations and which we denote by $G_3$. It is easily seen that any whirl trans-
formation may be put in the form ³

$$T_\alpha S_k T_\beta.$$

The slide $S_k$ is

$$\bar u = u, \qquad \bar v = v, \qquad \bar w = w + k.$$

The turn $T_\alpha$ is

$$\bar u = u + \alpha, \qquad \bar v = v\cos\alpha + w\sin\alpha, \qquad \bar w = -v\sin\alpha + w\cos\alpha.$$

³ See Kasner, American Journal of Mathematics, 1911. The name whirl for TST
was suggested by D. Sole in my seminar. Recently this theory has been extended to
spherical geometry by K. Strubecker, Jahr. d. Math. Ver., vol. 44 (1934), pp. 184-195,
who suggests the term turbine-rotation for whirl.
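It is then seen that any whirl transformation may be given by the equations

$$\bar u = u + \alpha + \beta, \qquad \bar v = v\cos(\alpha + \beta) + w\sin(\alpha + \beta) + k\sin\beta,$$
$$\bar w = -v\sin(\alpha + \beta) + w\cos(\alpha + \beta) + k\cos\beta.$$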
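
The closed form of a whirl can be compared with the composition of its three factors. The sketch below assumes that the product is applied from left to right (turn through $\alpha$, slide through $k$, turn through $\beta$) and that the slide acts by $w \to w + k$; elements are represented as plain tuples $(u, v, w)$, and the function names are illustrative only.

```python
import math

def turn(alpha, e):
    """Turn through alpha: same point, direction rotated by alpha."""
    u, v, w = e
    c, s = math.cos(alpha), math.sin(alpha)
    return u + alpha, v * c + w * s, -v * s + w * c

def slide(k, e):
    """Slide through k: same line, point moved a distance k along it."""
    u, v, w = e
    return u, v, w + k

def whirl(alpha, k, beta, e):
    """Composition: turn(alpha), then slide(k), then turn(beta)."""
    return turn(beta, slide(k, turn(alpha, e)))

def whirl_closed_form(alpha, k, beta, e):
    """Closed-form equations of the same whirl."""
    u, v, w = e
    c, s = math.cos(alpha + beta), math.sin(alpha + beta)
    return (u + alpha + beta,
            v * c + w * s + k * math.sin(beta),
            -v * s + w * c + k * math.cos(beta))
```

The two functions agree for every element, which is the statement that the composition has the three parameters $\alpha + \beta$, $k\sin\beta$, $k\cos\beta$.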


It is found that the only contact transformations of the set of whirl trans- 


formations are 


The first represents a dilatation D,; and the second, which may be written 
DT, represents a dilatation accompanied by reversal of orientation. 
Our group of whirl transformations may be written in the simple form 


* 5 


and hence any whirl transformation is given in the form

$$\bar u = u + \alpha,$$
$$\bar v = v\cos\alpha + w\sin\alpha + d,$$


We give now, without proof, a new characteristic property of whirl trans-
formation in terms of the central lines of a field.

THEOREM 33. For an element transformation to be such, that the cen-
tral lines of every field $w = f(u, v)$ are identical with the central lines of the
corresponding field $\bar w = \bar f(\bar u, \bar v)$, it is necessary and sufficient that the element
transformation be a whirl transformation.


We remark that the group of whirls is isomorphic to the group of motions. 
These two groups are commutative and together generate a new group of six 
parameters, of considerable interest in the geometry of elements. 


Extension of Scheffer's theory of isogonal and equi-tangential trajectories.
First we state the following:


THEOREM 34. If two fields are related by a whirl transformation, then
the two conjugate fields are related by a whirl transformation.


Let $F$ and $G$ be two fields such that $G$ is obtained from $F$ by applying
a whirl transformation $W$ to $F$. Then, by means of the above theorem, we
know that there exists a whirl transformation $\bar W$ such that the two fields
$\bar F$ and $\bar G$, the conjugate fields of $F$ and $G$ respectively, have the property that
$\bar G$ is obtained from $\bar F$ by applying $\bar W$ to $\bar F$. We call $\bar W$ the conjugate whirl
transformation of $W$.

Let $E_0$ and $S_0$ be a fixed element and a fixed series respectively. There
exists a unique one parameter family of transformations $T$, which is a subset
of the group of whirl transformations $W$, such that any transformation of $T$
carries $E_0$ into an element of $S_0$. It follows that with any element $E$ of the
plane there is associated a unique series $S$, such that any transformation of $T$
carries the element $E$ into an element of $S$. We define $S$ to be the quasi-path
series of $E$ for the set of transformations $T$. It is seen that the set of quasi-
path series for the set of transformations $T$ is at most a three parameter family
of series. We denote the totality of quasi-path series for the set of trans-
formations $T$ by $\Sigma$.

We say that the one parameter family of transformations $\bar T$ is the con-
jugate set of transformations of the set of transformations $T$, if each trans-
formation of $\bar T$ is the conjugate whirl transformation of a transformation of
$T$ and conversely. We denote any quasi-path series of $\bar T$ by $\bar S$ and the totality
of quasi-path series of $\bar T$ by $\bar\Sigma$. We shall say that two series are conjugate with
respect to $T$ or $\bar T$, if one is a series of $\Sigma$ and the other is a series of $\bar\Sigma$.

Now let us apply $T$ to a one parameter family of curves $F_1$. We then
obtain $\infty^1$ new one parameter families of curves, or collectively, a two parameter
family of curves $F_2$. Similarly let us apply $\bar T$ to the conjugate family of
curves $\bar F_1$ of the family of curves $F_1$. From Theorems 33 and 34 we obtain:


THEOREM 35. Consider any one of the quasi-path series $S$ of the set
$\Sigma$ connected with $T$. Each element of $S$ determines a curve of the doubly in-
finite system of curves $F_2$ generated by applying $T$ to any simply infinite
system $F_1$. The locus of the centers of the $\infty^1$ circles osculating these curves
at these elements is a straight line. Hence these circles touch a certain series
$\bar S$ of the set $\bar\Sigma$ conjugate to $\Sigma$ with respect to $T$.

THEOREM 36. According to the previous theorem, the system $F_2$ obtained
by applying $T$ to $F_1$ induces a definite correspondence between the set of series
$\Sigma$ and the conjugate set $\bar\Sigma$. There exists another system $\bar F_2$, obtained by ap-
plying $\bar T$ to $\bar F_1$, for which this correspondence is precisely reversed.


If we place on $T$ the restriction that it be a group of transformations,
then the quasi-path series become path series and moreover the path series are
turbines all congruent to each other. The set of transformations $\bar T$ is also a
group of transformations and its path series are turbines, which are the con-
jugate turbines of the turbines which are the path series of $T$. Thus we
obtain the following two theorems due to Kasner ⁴ which are themselves
generalizations of Scheffer's ⁵ fundamental theorems on isogonal and equi-
tangential trajectories:


THEOREM 37. Consider any one of the path turbines $S$ of the set $\Sigma$ con-
nected with the one parameter group of transformations $T$. Each element of $S$
determines a curve of the doubly-infinite system $F_2$ generated by applying $T$
to any simply-infinite system $F_1$. The locus of the centers of the $\infty^1$ circles
osculating these curves at these elements is a straight line. These circles touch
a certain turbine $\bar S$ of the set $\bar\Sigma$ conjugate to $\Sigma$.


THEOREM 38. According to the previous theorem, the system $F_2$ obtained
by applying $T$ to $F_1$ induces a definite correspondence between the set of
turbines $\Sigma$ and the conjugate set $\bar\Sigma$. There exists another system $\bar F_2$, obtained
by applying $\bar T$ to $\bar F_1$, for which this correspondence is precisely reversed.


In Theorems 37 and 38, if we let $T$ be first the group of turns and then
the group of slides, we obtain Scheffer's theorems on isogonal and equi-
tangential trajectories. (Part of Scheffer's first theorem was discovered by
Cesàro.) The most general families whose central loci are straight lines have
been studied by Kasner⁶ (velocity families and the dual type). In this con-
nection we may obtain characterizations of both the conformal and the equi-
long groups.


CoLUMBIA UNIVERSITY, 
New York. 


⁴ American Journal of Mathematics, 1911.

⁵ Mathematische Annalen, 1905.

⁶ Princeton Colloquium, 1913, 1934, American Journal of Mathematics, 1910, and
several abstracts in Bulletin of the American Mathematical Society, 1930-1935.




PARALLELISM AND EQUIDISTANCE OF CONGRUENCES OF
CURVES OF ORTHOGONAL ENNUPLES.*


By R. M. PETERS.


It is the purpose of this paper to develop some theorems relating to angular
and distantial spreads¹ of congruences of curves of orthogonal ennuples lying
in an n-dimensional Riemannian space $V_n$. We shall be particularly concerned
with congruences belonging to a sheaf; that is, to a totality of $\infty^{n-1}$ con-
gruences of which each two cut under a constant angle. It is assumed that
the linear element of the $V_n$ is defined by the positive definite quadratic form
$ds^2 = g_{ij}\,dx^i dx^j$, the $x$'s being coordinates of the $V_n$, and the $g_{ij}$'s real analytic
functions of the $x$'s. The curves considered are assumed to be real and analytic.

We begin by considering the angular spreads (or associate curvatures, in
Bianchi's terminology) of n mutually orthogonal congruences $C_h$ $(h = 1, 2, \dots, n)$,
with respect to any fixed congruence of curves $C$. Let $\lambda_{h|}{}^i$ and $\xi^i$ be
respectively the unit vectors tangent to the curves $C_h$ and $C$. Then

    $\xi^i = \sum_h \cos\alpha_h\,\lambda_{h|}{}^i, \qquad \sum_h \cos^2\alpha_h = 1,$

where $\alpha_h$ is the angle between the curves $C_h$ and $C$. We denote by $\mu_{l|}{}^i$ the
angular spread vector, and by $1/r_l$ its length, the angular spread, of the curves
$C_l$ with respect to the curves $C$. Then

(1) $\mu_{l|}{}^i = \sum_h \cos\alpha_h\,\lambda_{l|}{}^i{}_{,j}\,\lambda_{h|}{}^j = \sum_h \cos\alpha_h\,\mu_{lh|}{}^i,$

where $\mu_{lh|}{}^i$ is the angular spread vector of the curves $C_l$ with respect to the
curves $C_h$, and $\mu_{ll|}{}^i$ the first curvature vector of the curves $C_l$. From (1)
we conclude


THEOREM 1. If the curves $C_l$ of one congruence are geodesics and are
parallel with respect to the curves of every other congruence of the ennuple,
then the curves $C_l$ are parallel with respect to the curves of every other con-
gruence in $V_n$; that is, their tangent vectors form a field of parallel unit vectors.


* Received February 20, 1937. 

¹ For the definition and significance of these terms see Graustein, “The geometry
of Riemannian spaces,” Transactions of the American Mathematical Society, vol. 36,
no. 3, p. 555, and Peters, “Parallelism and equidistance in Riemannian geometry,”
American Journal of Mathematics, vol. 57 (1935), pp. 103-111.




THEOREM 2. If the curves of each congruence of the ennuple are geodesics
and are parallel with respect to the curves of every other congruence of the
ennuple, then their tangent vectors form n fields of parallel vectors.²

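Introducing the coefficients of rotation $\gamma_{pqr}$ of the orthogonal ennuple,
we have

(2) $\mu_{lh|}{}^i = \sum_r \gamma_{lrh}\,\lambda_{r|}{}^i = -\sum_r \gamma_{rlh}\,\lambda_{r|}{}^i$

and

(3) $1/r_{lh}^2 = \sum_r \gamma_{rlh}^2,$

where $1/r_{lh}$ is the angular spread of the curves $C_l$ with respect to the curves $C_h$.
Substituting (2) in (1),

(4) $\mu_{l|}{}^i = -\sum_{r,h} \cos\alpha_h\,\gamma_{rlh}\,\lambda_{r|}{}^i$

and

(5) $1/r_l^2 = \sum_{r,h,k} \cos\alpha_h \cos\alpha_k\,\gamma_{rlh}\,\gamma_{rlk}.$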


Now let us consider a second orthogonal ennuple of mutually orthogonal
congruences of curves $\bar C_h$ $(h = 1, 2, \dots, n)$, with unit tangent vectors $\bar\lambda_{h|}{}^i$,
and let both ennuples belong to a sheaf. Expressing the vectors $\bar\lambda_{h|}{}^i$ in terms
of the vectors $\lambda_{k|}{}^i$, we have

    $\bar\lambda_{h|}{}^i = \sum_k \cos\beta_{hk}\,\lambda_{k|}{}^i,$

where $\beta_{hk}$ is the constant angle between the curves $\bar C_h$ and $C_k$. Since the two
ennuples are orthogonal, the angles $\beta_{hk}$ satisfy the relation

(6) $\sum_h \cos\beta_{hk} \cos\beta_{hl} = \delta_k^l,$

$\delta_k^l$ being the Kronecker delta.
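Let $\bar\mu_{h|}{}^i$ be the angular spread vector and $1/\bar r_h$ its length, the angular
spread, of the curves $\bar C_h$ with respect to the curves $C$. We inquire what rela-
tions exist between the two sets of n vectors $\mu_{h|}{}^i$ and $\bar\mu_{h|}{}^i$, and between the
quantities $1/r_h$ and $1/\bar r_h$.

(7) $\bar\mu_{h|}{}^i = \sum_k \cos\beta_{hk}\,\lambda_{k|}{}^i{}_{,j}\,\xi^j = \sum_k \cos\beta_{hk}\,\mu_{k|}{}^i.$

Since $\bar C_h$ may be any congruence of the sheaf, we can state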


² The ennuple is then Cartesian and the $V_n$ is Euclidean. See Graustein, loc. cit.,
p. 564.




THEOREM 3. The angular spread vector of any congruence of curves $\bar C$
belonging to a sheaf with respect to an arbitrary congruence of curves $C$ is
linearly expressible in terms of the angular spread vectors of any orthogonal
ennuple of curves $C_h$ of the sheaf with respect to the same curves $C$, the
coefficients of combination being the cosines of the constant angles between
the curves $C_h$ and $\bar C$.


THEOREM 4. If the curves of any orthogonal ennuple of congruences of
the sheaf are parallel with respect to an arbitrary congruence of curves $C$, then
the curves of every congruence of the sheaf are parallel with respect to the
curves $C$.

This result combined with Theorem 2 gives 

THEOREM 5. If the curves of each congruence of any orthogonal ennuple
of a sheaf are geodesics, and are parallel with respect to the curves of every
other congruence of the ennuple, then every congruence of curves of the sheaf
is parallel with respect to every congruence of curves in the $V_n$.


For the angular spread $1/\bar r_h$ we have

    $1/\bar r_h^2 = \sum_{k,l} \cos\beta_{hk} \cos\beta_{hl}\, g_{ij}\,\mu_{k|}{}^i\,\mu_{l|}{}^j.$

Summing over h and using (6),

(8) $\sum_h 1/\bar r_h^2 = \sum_h 1/r_h^2.$


In this result is contained a theorem given by Bortolotti:³ the sum of the
squares of the angular spreads of the curves of an orthogonal ennuple of con-
gruences of a sheaf with respect to an arbitrary congruence of curves is the
same for every orthogonal ennuple of the sheaf.
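
The invariance expressed by (8) can be checked numerically. The following is only a sketch under stated assumptions: it works at a single point where the coordinates are taken so that the metric is the identity, takes n = 3, builds the matrix of cosines as a product of plane rotations so that relation (6) holds, and uses random vectors as stand-ins for the angular spread vectors. All function and variable names are hypothetical and are not part of the paper.

```python
import math
import random


def rotation(n, p, q, angle):
    """Return the n x n rotation acting in the (p, q) coordinate plane."""
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    c, s = math.cos(angle), math.sin(angle)
    m[p][p], m[p][q], m[q][p], m[q][q] = c, -s, s, c
    return m


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transform(cosines, vectors):
    """New vectors: vector h is the sum over k of cosines[h][k] * vectors[k]."""
    n, dim = len(vectors), len(vectors[0])
    return [[sum(cosines[h][k] * vectors[k][i] for k in range(n))
             for i in range(dim)] for h in range(n)]


def sum_of_squared_lengths(vectors):
    """Sum of squared Euclidean lengths (metric assumed to be the identity)."""
    return sum(x * x for v in vectors for x in v)


random.seed(0)
n = 3
# An orthogonal matrix of cosines, standing in for cos(beta_hk).
cosines = matmul(rotation(n, 0, 1, 0.7),
                 matmul(rotation(n, 1, 2, -1.3), rotation(n, 0, 2, 0.4)))
# Arbitrary stand-ins for the spread vectors of the first ennuple.
mu = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
mu_bar = transform(cosines, mu)

before = sum_of_squared_lengths(mu)
after = sum_of_squared_lengths(mu_bar)
print(before, after)
```

The two printed numbers agree up to rounding, which is the content of (8) in this simplified setting; the same computation applies to any identity of this form obtained from a linear relation with orthogonal coefficients.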

We shall show that corresponding facts hold for distantial spread vectors, 
providing the curves C, previously chosen arbitrarily, are required to belong 
to the sheaf. 

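Let $\nu_{l|}{}^i$ be the angular spread vector of the curves $C$ with respect to the
curves $C_l$. Then

    $\nu_{l|}{}^i = \xi^i{}_{,j}\,\lambda_{l|}{}^j$

    $= \sum_h (\cos\alpha_h\,\mu_{hl|}{}^i + \lambda_{h|}{}^i\,\partial\cos\alpha_h/\partial s^l),$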


³ Bortolotti, “Stelle di congruenze e parallelismo assoluto,” Rendiconti dei Lincei
(6), vol. 9 (1929), pp. 530-538.




where $s^l$ is the arc of the curves $C_l$ and $\partial/\partial s^l$ denotes directional differentiation
in the positive direction of the curves $C_l$.
If we denote by $b_{l|}{}^i$ the distantial spread vector⁴ of the curves $C_l$ and $C$,
taken in the order named, then

    $b_{l|}{}^i = \mu_{l|}{}^i - \nu_{l|}{}^i$

    $= \sum_h [\cos\alpha_h\,(\mu_{lh|}{}^i - \mu_{hl|}{}^i) - \lambda_{h|}{}^i\,\partial\cos\alpha_h/\partial s^l]$

(9) $= \sum_h (\cos\alpha_h\,b_{lh|}{}^i - \lambda_{h|}{}^i\,\partial\cos\alpha_h/\partial s^l),$

where $b_{lh|}{}^i$ is the distantial spread vector of the congruences of curves $C_l$ and
$C_h$. Formula (9) holds when the curves $C$ are the curves of any congruence
in $V_n$.

We recall that the distantial spread vector of two congruences vanishes
identically if and only if the two congruences lie in a family of two-dimensional
surfaces, and the curves of each congruence are equidistant with respect to the
curves of the other.⁵ Hence we conclude from (9)


THEOREM 6. If the distantial spread vectors formed for the curves $C_l$
and every other congruence of curves of the ennuple are null vectors, then the
curves $C_l$ are equidistant with respect to the congruences of curves $C$ which
intersect the curves of all congruences of the ennuple at angles which are
constant along the curves $C_l$. In particular, the curves $C_l$ are equidistant with
respect to the curves of all congruences belonging to the sheaf.


If we require that the curves $C$ belong to the sheaf, (9) becomes analogous
in form to (1).

(10) $b_{l|}{}^i = \sum_h \cos\alpha_h\,b_{lh|}{}^i.$

An ennuple is called a Tchebycheff ennuple⁶ if the distantial spread vector
formed for each two congruences of the ennuple is a null vector. Hence we
have from (10)


THEOREM 7. If the orthogonal ennuple is an ennuple of Tchebycheff,
then the curves of the ennuple are equidistant with respect to every other con-
gruence of curves of the sheaf, and vice versa.


Returning to the second orthogonal ennuple of curves $\bar C_h$ of the sheaf, let
$\bar\nu_{l|}{}^i$ denote the angular spread vector of the curves $C$ with respect to the curves
$\bar C_l$, where the curves $C$ are again arbitrary.


⁴ For the definition of the distantial spread vector, see Graustein, loc. cit., p. 555.
⁵ Graustein, loc. cit., p. 559.
⁶ Graustein, loc. cit., p. 563.




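(11) $\bar\nu_{l|}{}^i = \xi^i{}_{,j}\,\bar\lambda_{l|}{}^j = \sum_k \cos\beta_{lk}\,\xi^i{}_{,j}\,\lambda_{k|}{}^j = \sum_k \cos\beta_{lk}\,\nu_{k|}{}^i.$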


We note that (11) holds regardless of whether or not the angles $\beta_{lk}$ are
constant. That is, (11) gives the relation between the angular spread vectors
of an arbitrary congruence of curves $C$ with respect to the curves of two
arbitrary orthogonal ennuples.

For the angular spread, $1/\bar\rho_l$, of the curves $C$ with respect to the curves $\bar C_l$,
we have

(12) $1/\bar\rho_l^2 = \sum_{h,k} \cos\beta_{lh} \cos\beta_{lk}\, g_{ij}\,\nu_{h|}{}^i\,\nu_{k|}{}^j$

    $= \sum_{h,k} \cos\beta_{lh} \cos\beta_{lk} \cos\theta_{hk}/\rho_h\rho_k,$

where $\theta_{hk}$ is the angle between the vectors $\nu_{h|}{}^i$ and $\nu_{k|}{}^i$, and $1/\rho_h$ is the angular
spread of the curves $C$ with respect to the curves $C_h$. In particular, if we take
the curves $\bar C_l$ as coincident with the curves $C$, we obtain

(13) $1/\rho^2 = \sum_{h,k} \cos\alpha_h \cos\alpha_k \cos\theta_{hk}/\rho_h\rho_k,$

$1/\rho$ being the first curvature of the curves $C$.⁷

From (11) and (13) it follows that if the curves $C$ of an arbitrary con-
gruence are parallel with respect to the curves of any orthogonal ennuple of
congruences, then the curves $C$ are parallel with respect to the curves of every
congruence in the $V_n$; that is, the tangents to the curves $C$ form a field of
unit parallel vectors. The curves $C$ are then geodesics.

Summing over l in (12) we obtain

(14) $\sum_l 1/\bar\rho_l^2 = \sum_l 1/\rho_l^2.$

This formula gives a theorem corresponding to the one quoted from Bortolotti,
the rôles of the curves $C$ and $C_l$ being interchanged.


THEOREM 8. The sum of the squares of the angular spreads of the curves 
of an arbitrary congruence with respect to the curves of an orthogonal ennuple 


is independent of the ennuple chosen. 


Returning to the consideration of distantial spreads we have from (7) and
(11) for the distantial spread vector $\bar b_{l|}{}^i$ of the congruences $\bar C_l$ and $C$, in the
order named,


⁷ This is a generalization of a form of Liouville's formula for geodesic curvature
in a $V_2$ given by Graustein, Transactions of the American Mathematical Society, vol. 34,
no. 3, p. 571.




(15) $\bar b_{l|}{}^i = \bar\mu_{l|}{}^i - \bar\nu_{l|}{}^i = \sum_k \cos\beta_{lk}\,b_{k|}{}^i,$

where the curves $\bar C_l$ and $C_k$ now belong to orthogonal ennuples of the same
sheaf so that the angles $\beta_{lk}$ are constants. This result is entirely analogous
to formula (7) for the angular spread of the curves $\bar C_l$ with respect to the
curves $C$, in both cases the curves $C$ being entirely arbitrary. We draw con-
clusions analogous to Theorems 3, 4, and 5.


THEOREM 9. The distantial spread vector of any congruence of curves $\bar C$
of a sheaf and an arbitrary congruence of curves $C$ is linearly expressible in
terms of the distantial spread vectors of any orthogonal ennuple of curves $C_h$
of the sheaf and the same curves $C$, the coefficients of combination being the
cosines of the constant angles between the curves $C_h$ and $\bar C$.


THEOREM 10. If, for an arbitrary congruence of curves $C$ and each con-
gruence of any orthogonal ennuple of the sheaf, the distantial spread vector
is a null vector, then the distantial spread vector of $C$ and each congruence
of the sheaf is also a null vector.


Furthermore, using Theorem 7 and requiring that the congruence of
curves $C$ belong to the sheaf, we have


THEOREM 11. If an orthogonal ennuple of the sheaf is an ennuple of
Tchebycheff, then every congruence of curves of the sheaf is equidistant with
respect to every other congruence of the sheaf.


We note that this result includes that of Theorem 7. 

Let $1/b_l$ and $1/\bar b_l$ denote respectively the lengths of the distantial spread
vectors $b_{l|}{}^i$ and $\bar b_{l|}{}^i$. Multiplying (15) by $g_{ij}\,\bar b_{l|}{}^j$, summing over l, and
using (6) we obtain

(16) $\sum_l 1/\bar b_l^2 = \sum_l 1/b_l^2,$

a result analogous to (8) and (14) for angular spreads, which gives the
theorem corresponding to Bortolotti's theorem and Theorem 8.


THEOREM 12. The sum of the squares of the lengths of the distantial
spread vectors formed for the curves of an arbitrary congruence of a sheaf and
the curves of every congruence of an orthogonal ennuple of the sheaf is in-
dependent of the ennuple chosen.


Instead of the single congruence of curves $C$, let us now consider n
mutually orthogonal congruences $C^*_h$, not necessarily belonging to the sheaf,




with unit tangent vectors $\xi_{h|}{}^i$. In the previous work we attach an h to each
symbol formed with respect to the curves $C$; for example, $\alpha_{kh}$ now denotes the
angle between the curves $C_k$ and $C^*_h$. Formula (5) becomes

(17) $1/r^{*2}_{hk} = \sum_{r,p,q} \cos\alpha_{pk} \cos\alpha_{qk}\,\gamma_{rhp}\,\gamma_{rhq},$

where we have replaced $1/r_h$, the angular spread of the curves $C_h$ with respect
to the curves $C$, by $1/r^*_{hk}$, the angular spread of the curves $C_h$ with respect to
the curves $C^*_k$. Summing over k, we have

(18) $\sum_k 1/r^{*2}_{hk} = \sum_{r,p} \gamma_{rhp}^2,$

since the angles $\alpha$ satisfy the relation

    $\sum_k \cos\alpha_{pk} \cos\alpha_{qk} = \delta_p^q.$

Using (3), (18) becomes

(19) $\sum_k 1/r^{*2}_{hk} = \sum_k 1/r_{hk}^2$

and, summing over h,

(20) $\sum_{h,k} 1/r^{*2}_{hk} = \sum_{h,k} 1/r_{hk}^2.$

Incidentally we note that (19) is essentially identical with (14).
Since we have seen from formula (8) that $\sum_h 1/r^{*2}_{hk}$ is independent of the
ennuple of curves $C_h$, and since (20) shows that $\sum_{h,k} 1/r^{*2}_{hk}$ is independent of
the ennuple of curves $C^*_k$, we conclude


THEOREM 13. The sum of the squares of the angular spreads of each 
congruence of curves of an orthogonal ennuple of a sheaf with respect to the 
curves of each congruence of an arbitrary orthogonal ennuple is the same for 
any choice of both ennuples. 


Formula (18) furnishes incidentally a proof of the known fact that the
quantity $\sum_{h,k} 1/r_{hk}^2$ is the same for every orthogonal ennuple of the sheaf.⁸
If we denote by $b^*_{hk|}{}^i$ the distantial spread vector of the curves $C_h$ with
respect to the curves $C^*_k$, (10) becomes

(21) $b^*_{hk|}{}^i = \sum_l \cos\alpha_{lk}\,b_{hl|}{}^i,$

where, we recall, the curves $C^*_k$ now belong to the sheaf. Multiplying by
$g_{ij}\,b^*_{hk|}{}^j$ and summing over i, j, k, and h, we have


⁸ Graustein, Transactions of the American Mathematical Society, vol. 36, no. 3,
p. 579, and Bortolotti, loc. cit.


(22) $\sum_{h,k} 1/b^{*2}_{hk} = \sum_{h,k} 1/b_{hk}^2,$

where $1/b^*_{hk}$ is the length of the vector $b^*_{hk|}{}^i$.
By (16) $\sum_h 1/b^{*2}_{hk}$ is independent of the ennuple of curves $C_h$ chosen from
the sheaf, and by (22) $\sum_{h,k} 1/b^{*2}_{hk}$ is independent of the choice of the ennuple
of curves $C^*_k$. Hence we have the following


THEOREM 14. The sum of the squares of the lengths of all the distantial
spread vectors of the curves $C_h$ of an orthogonal ennuple of a sheaf formed
with respect to all the curves of a second orthogonal ennuple of curves $C^*_k$ of
the sheaf is independent of the choice of both ennuples.


Here also we have an indirect proof of the fact that $\sum_{h,k} 1/b_{hk}^2$ is the same
for every orthogonal ennuple of the sheaf.⁹

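Let us now consider the special case when the sheaf contains an orthogonal
ennuple of normal congruences. If we take these as the congruences $C_h$,
(18) becomes, since $\gamma_{rhp} = 0$ for r, h, p all distinct,

    $\sum_k 1/r^{*2}_{hk} = \sum_r \gamma_{rhh}^2 + \sum_r \gamma_{rhr}^2 = 1/r_{hh}^2 + \sum_r \gamma_{rhr}^2,$

$1/r_{hh}$ being the first curvature of the curves $C_h$. Summing over h,

    $\sum_{h,k} 1/r^{*2}_{hk} = 2 \sum_h 1/r_{hh}^2.$

Now (2) becomes

    $\mu_{hk|}{}^i = \gamma_{hkk}\,\lambda_{k|}{}^i \qquad (h \neq k),$

and hence

    $b_{hk|}{}^i = \gamma_{hkk}\,\lambda_{k|}{}^i - \gamma_{khh}\,\lambda_{h|}{}^i,$

    $1/b_{hk}^2 = \gamma_{hkk}^2 + \gamma_{khh}^2.$

Summing over h and k,

(23) $\sum_{h,k} 1/b_{hk}^2 = 2 \sum_h 1/r_{hh}^2.$

Hence from (20), (22), and (23)

(24) $\sum_{h,k} 1/r^{*2}_{hk} = \sum_{h,k} 1/r_{hk}^2 = \sum_{h,k} 1/b^{*2}_{hk} = \sum_{h,k} 1/b_{hk}^2 = 2 \sum_h 1/r_{hh}^2.$

Equations (24) show that we have to deal with the following five
properties: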


⁹ Graustein, loc. cit., and Bortolotti, loc. cit.




(A) $1/r_{hh} = 0$, $(h = 1, 2, \dots, n)$: Curves $C_h$ geodesics.

(B) $1/r_{hk} = 0$, $(h, k = 1, 2, \dots, n;\ h \neq k)$: Curves $C_h$ parallel with
respect to the curves $C_k$.

(C) $1/b_{hk} = 0$, $(h, k = 1, 2, \dots, n)$: Curves $C_h$ equidistant with respect
to the curves $C_k$.

(D) $1/r^*_{hk} = 0$, $(h, k = 1, 2, \dots, n)$: Curves $C_h$ parallel with respect
to the curves $C^*_k$ of any orthogonal ennuple in $V_n$.

(E) $1/b^*_{hk} = 0$, $(h, k = 1, 2, \dots, n)$: Curves $C_h$ equidistant with
respect to the curves $C^*_k$ of any orthogonal ennuple of the sheaf, and vice versa,
and each pair of congruences $C_h$ and $C^*_k$ lying in a family of two-dimensional
surfaces.

From (24) we conclude 

THEOREM 15. If an orthogonal ennuple of normal congruences of curves
$C_h$ has any one of the above properties, then it also has the remaining four,
and all congruences of the sheaf consist of geodesics.

Let us add a few properties to the above list so as to summarize and extend
some of our previous results. We rewrite (D) and (E) in slightly different
form:

(D) Curves $C_h$, $(h = 1, 2, \dots, n)$, parallel with respect to the curves
of every congruence in $V_n$.

(E) Curves $C_h$, $(h = 1, 2, \dots, n)$, equidistant with respect to the curves
of every congruence in the sheaf, and vice versa.

(F) The curves of every congruence of the sheaf parallel with respect to
the curves of every other congruence in $V_n$.

(G) The curves of every congruence of the sheaf equidistant with respect
to the curves of every other congruence of the sheaf, and each pair of con-
gruences lying in a family of two-dimensional surfaces.

(H) Curves $C_h$, $(h = 1, 2, \dots, n)$, all normal.


By Theorem 2, properties (A) and (B) lead to (D); by Theorem 5, these
same properties lead to (F), a result superseding the former since (F) in-
cludes (D).

By Theorem 7, (C) leads to (E); and by Theorem 11, (C) leads to (G),
(G) including (E).


¹⁰ Part of this theorem, to the effect that if (C) holds, then (A) and (B) hold, and
conversely, has been proved by Graustein, loc. cit., p. 564, for a non-orthogonal ennuple 
in which the curves of each two congruences intersect at an angle constant along the 
curves of both congruences. 




We recall that if an orthogonal ennuple is an ennuple of Tchebycheff,
it consists of normal congruences.¹¹ Furthermore, if (B) is valid, then so is
(C). Hence we have finally


THEOREM 16. If either property (B) or (C) holds, then all the remain-
ing properties hold.


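We shall now denote by p the sheaf containing the curves $C_h$, and consider
a second sheaf p' and the relations between the angular and distantial spread
vectors of the curves of the two sheaves with respect to an arbitrary congruence
of curves $C$ with unit tangent vectors $\xi^i$. Let $C'_h$ $(h = 1, 2, \dots, n)$ be the
curves of any orthogonal ennuple of p', and $\lambda'_{h|}{}^i$ their unit tangent vectors.
Let $\beta_{hk}$ be the angle between the curves $C'_h$ and $C_k$, where $\beta_{hk}$ is now a variable.
Then

    $\lambda'_{h|}{}^i = \sum_k \cos\beta_{hk}\,\lambda_{k|}{}^i.$

Denote by $\mu'_{h|}{}^i$ the angular spread vector of the curves $C'_h$ with respect to the
curves $C$, and by $\nu'_{h|}{}^i$ the angular spread vector of the curves $C$ with respect
to the curves $C'_h$. Then

    $\mu'_{h|}{}^i = \sum_k (\cos\beta_{hk}\,\lambda_{k|}{}^i{}_{,j}\,\xi^j + \lambda_{k|}{}^i\,\partial\cos\beta_{hk}/\partial s)$

(25) $= \sum_k (\cos\beta_{hk}\,\mu_{k|}{}^i + \lambda_{k|}{}^i\,\partial\cos\beta_{hk}/\partial s),$

where s is the arc of the curves $C$. And

    $\nu'_{h|}{}^i = \sum_k \cos\beta_{hk}\,\nu_{k|}{}^i.$

For $b'_{h|}{}^i$, the distantial spread vector of the curves $C'_h$ and $C$, we have

(26) $b'_{h|}{}^i = \mu'_{h|}{}^i - \nu'_{h|}{}^i = \sum_k (\cos\beta_{hk}\,b_{k|}{}^i + \lambda_{k|}{}^i\,\partial\cos\beta_{hk}/\partial s).$

The relations between the lengths of the vectors in question are given by

(27) $\sum_h 1/r'^2_h = \sum_h 1/r_h^2 + 2 \sum_{h,k,l} \mu_{k|i}\,\lambda_{l|}{}^i \cos\beta_{hk}\,\partial\cos\beta_{hl}/\partial s$
        $+ \sum_{h,k} (\partial\cos\beta_{hk}/\partial s)^2$

and

(28) $\sum_h 1/b'^2_h = \sum_h 1/b_h^2 + 2 \sum_{h,k,l} b_{k|i}\,\lambda_{l|}{}^i \cos\beta_{hk}\,\partial\cos\beta_{hl}/\partial s$
        $+ \sum_{h,k} (\partial\cos\beta_{hk}/\partial s)^2.$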
h,k 


¹¹ Graustein, loc. cit., p. 563.




The equations (25) through (28) can, of course, be regarded as the
relations between the angular and distantial spread vectors of the n congruences
of any two orthogonal ennuples with respect to an arbitrary congruence of
curves $C$, without bringing in any notion of sheaves.

From (27) and (28) we conclude 


THEOREM 17. The sums of the squares of the lengths of the angular or
distantial spread vectors of any orthogonal ennuples of the two sheaves are the
same with respect to the curves along which the angles $\beta_{hk}$ are constant.¹²


In particular, if all the curves $C_h$ of the ennuple of p are parallel with
respect to a congruence of curves $C$ along which the angles $\beta_{hk}$ are constant,
then all curves of both sheaves are parallel with respect to the curves $C$. A
corresponding result can be stated for equidistance.

If the n congruences $C_k$ are normal and consist of geodesics, then (25)
becomes

    $\mu'_{h|}{}^i = \sum_k \lambda_{k|}{}^i\,\partial\cos\beta_{hk}/\partial s$

and

(29) $1/r'^2_h = \sum_k (\partial\cos\beta_{hk}/\partial s)^2.$

Hence,
Hence, 


THEOREM 18. The square of the length of the angular spread vector of
any congruence of curves $C'$ in a Euclidean $V_n$ with respect to the curves of
any other congruence of curves $C$ is equal to the sum of the squares of the
directional derivatives in the direction of the curves $C$ of the angles $\beta_k$ between
the curves $C'$ and the curves $C_k$, $(k = 1, 2, \dots, n)$, of an orthogonal ennuple
of normal congruences of geodesics.¹³


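If the curves $C$ belong to the sheaf p, and again the n congruences of
curves $C_k$ are normal and consist of geodesics, then the curves $C$ are also
geodesics, and (26) becomes

    $b'_{h|}{}^i = \mu'_{h|}{}^i = \sum_k \lambda_{k|}{}^i\,\partial\cos\beta_{hk}/\partial s$

and

    $1/b'^2_h = 1/r'^2_h = \sum_k (\partial\cos\beta_{hk}/\partial s)^2.$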


LAKE ERIE COLLEGE. 


¹² A generalization of a result for angular spreads in a $V_2$ given by Graustein,
“Parallelism and equidistance in classical differential geometry,” Transactions of the
American Mathematical Society, vol. 34 (1932), p. 570.

¹³ This theorem is a generalization of Theorem 14 of Graustein, loc. cit., p. 570.




ON THE NON-ALTERNATING IMAGES OF LINEAR GRAPHS.* 


By Dick Wick HALL. 


Let A and B be compact metric spaces and T(A) = B a single valued
continuous transformation. Then T is said to be non-alternating¹ provided
that for no two distinct points x and y of B does the set $T^{-1}(x)$ separate the
set $T^{-1}(y)$ in A, i. e., there exists no separation $A - T^{-1}(x) = A_1 + A_2$
where $A_i \cdot T^{-1}(y) \neq 0$ $(i = 1, 2)$. If for each $x \in B$ the set $T^{-1}(x)$ is
connected, then T is said to be monotone.² A connected set M is said to be
cyclic if it contains no cut point, i. e., if $M - x$ is connected for every x in M.
If M be a locally connected continuum, and we shall always assume that it is,
then a subset E of M will be called a maximal cyclic subset if and only if it
is not a proper subset of any other cyclic subset of M. A subset E of M will
be called a cyclic element³ of M provided E is either (a) a maximal cyclic
subset of M, (b) a cut point of M, (c) an end point⁴ of M. A cyclic element
containing more than one point is called a true cyclic element. An arc A is
said to span a point-set M if A has its end points but no other points in
common with M. Throughout this paper we shall assume that all the linear
graphs mentioned are connected, and that all the point-sets considered are
imbedded in a three dimensional Euclidean space, since we deal only with
1-dimensional sets and any such set is topologically contained in an $E_3$.
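
For a linear graph the definition of a cyclic set can be tested by a finite computation. The sketch below is an illustration only, under the assumption that the graph is connected, has at least one edge, and is given as a list of vertices and a list of edges (pairs of vertices, with multiple edges and loops allowed); the function names are hypothetical. It uses two elementary observations: removing an interior point of an edge disconnects the graph exactly when deleting that edge does, and removing a vertex disconnects it exactly when the remaining vertices and edges are disconnected or a loop at that vertex is left as an isolated open arc.

```python
def is_connected(vertices, edges):
    """True if the finite multigraph (vertices, edges) is connected."""
    vertices = list(vertices)
    if not vertices:
        return False
    adjacent = {v: set() for v in vertices}
    for u, w in edges:
        adjacent[u].add(w)
        adjacent[w].add(u)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        for w in adjacent[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def has_cut_point(vertices, edges):
    """True if some point of the graph, as a point-set, is a cut point.

    Assumes a connected graph with at least one edge; edges are tuples.
    """
    edges = list(edges)
    # A point interior to an edge cuts the graph iff that edge is a bridge.
    for i in range(len(edges)):
        if not is_connected(vertices, edges[:i] + edges[i + 1:]):
            return True
    for v in vertices:
        rest = [u for u in vertices if u != v]
        loops = [e for e in edges if e == (v, v)]
        others = [e for e in edges if v not in e]
        if not rest:
            # Only loops at v: removing v leaves one open arc per loop.
            if len(loops) != 1:
                return True
        elif loops or not is_connected(rest, others):
            return True
    return False


def is_cyclic(vertices, edges):
    """True if the graph contains no cut point."""
    return not has_cut_point(vertices, edges)


print(is_cyclic(["v"], [("v", "v")]))                 # simple closed curve
print(is_cyclic(["u", "v"], [("u", "v")] * 3))        # three arcs joining u, v
print(is_cyclic(["u", "v"], [("u", "v")]))            # a single arc
print(is_cyclic(["v"], [("v", "v"), ("v", "v")]))     # figure eight
```

The first two graphs are cyclic; the single arc and the figure eight are not, since every interior point of the arc and the double point of the figure eight are cut points.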

In this paper a study is made of the possible images of a linear graph 
under a non-alternating transformation. It is shown that: I: A necessary 
and sufficient condition that a cyclic curve C be the non-alternating image of 


* Received July 23, 1936; Revised February 11, 1937. 

¹ See G. T. Whyburn, American Journal of Mathematics, vol. 56 (1934), pp. 294-
302. This paper will be referred to as W.

² This terminology has been suggested by C. B. Morrey. See his paper “The
topology of path surfaces,” American Journal of Mathematics, vol. 57 (1935), pp. 17-50.

³ See Kuratowski and Whyburn, Fundamenta Mathematicae, vol. 16 (1930), pp.
305-350.

⁴ A point p is an end point of a locally connected continuum M provided that M
contains no simple arc having p as an interior point. See R. L. Wilder, Fundamenta
Mathematicae, vol. 7 (1925), p. 358. For this particular definition of end point see
G. T. Whyburn, Transactions of the American Mathematical Society, vol. 29 (1927),
Theorem 12, p. 385.



a linear graph is that C be the sum of a finite number of simple arcs; II:
If B be the non-alternating image of a linear graph A, then every true cyclic
element of B is the sum of a finite number of simple arcs; III: Every curve
C which is the sum of a finite number of simple arcs is the non-alternating
image of a linear graph.


THEOREM 1. If A be a linear graph and T(A) = B is non-alternating,
then every true cyclic element of B is the sum of a finite number of simple
arcs.


Proof. By (W, 3.4), if $E_b$ be any true cyclic element of B there exists
a non-alternating transformation $W(A) = E_b$ such that none of the sets
$W^{-1}(x)$ separate A; and by (W, 3.5), there exists a true cyclic element $E_a$
of A such that $W(E_a) = E_b$. Then none of the sets $W^{-1}(x)$ separate $E_a$,
since any set which did so would also separate A which is impossible.

Now since $E_a$ is a true cyclic element of a linear graph we may write
$E_a = \sum_1^k \bar A_i$, where k is finite and $\bar A_i = a_i b_i$ is the closure of a free arc $A_i$.

Then W is monotone on $A_i$, for all i. Otherwise, for some x in $E_b$, $W^{-1}(x)$
would separate $E_a$, which is impossible. Two cases may arise: (a) If
$W(a_i) = W(b_i) = c_i$ for any i, then $\bar A_i$ maps into a single closed curve
having only the point $c_i$ in common with the rest of $E_b$. Consequently, since
$E_b$ is cyclic we must have $W(\bar A_i) = E_b$, a simple closed curve. Hence
$E_b$ is the sum of two simple arcs and the theorem follows. (b) If $W(a_i)$
$\neq W(b_i)$ for any i, then W is monotone on $\bar A_i$ for all i. Thus $W(\bar A_i) = B_i$
is a simple arc, and $E_b = \sum_1^k B_i$, which is the theorem.


LEMMA 1. If K be a cyclic curve such that there exists a linear graph
H and a non-alternating transformation T(H) = K, and if A be any simple
arc such that K + A is cyclic, then there exists a simple arc B spanning H,
and a non-alternating transformation Z(H + B) = K + A.


Proof. We may assume that H is cyclic, by (W, 3.5). Let a', a'' be the
end points of A, and b', b'' any points of $T^{-1}(a')$ and $T^{-1}(a'')$ respectively.
Let B be a simple arc spanning H and having b' and b'' as end points. Define
Z so that it is identical with T on H, while on B it is a homeomorphism send-
ing B into A and such that Z(b') = a', and Z(b'') = a''. Then
Z(H + B) = K + A.




Moreover, Z is non-alternating. Otherwise, we could find two points
x, y in K + A such that $Z^{-1}(x)$ separated $Z^{-1}(y)$ in H + B. Now (1): x
cannot lie in $A - K \cdot A$. Otherwise, $Z^{-1}(x)$, which is a single point since Z
is one-to-one on $B - B \cdot H$, would separate the cyclic set H + B, which is
impossible. (2) y cannot lie in $A - K \cdot A$, since Z is one-to-one on
$B - B \cdot H$, and no single point can be separated. (3) Consequently, both x
and y must lie in K. Since $Z^{-1}(x)$ separates $Z^{-1}(y)$ in H + B, it follows that
$Z^{-1}(y)$ contains two points y' and y'' such that $Z^{-1}(x)$ separates y' and y'' in
H + B. Now not both y' and y'' may lie in H, since T is non-alternating on
this set. Hence we may assume that y', say, lies in $B - B \cdot H$. By the defi-
nition of Z, $B \cdot Z^{-1}(x)$ is a single point, hence $Z^{-1}(x)$ cannot separate y' from
both b' and b'', say not from b'. Then, since $Z^{-1}(x)$ separates y' from y'', it
must separate b' from y''. But we may assume that $y'' \in H$, and hence $Z^{-1}(x)$
must separate two points of H. Thus $T^{-1}(x)$ must separate two points of H;
and consequently, by (W, 1.41), x is a cut point of K + A, which is a
contradiction.

Therefore, Z is non-alternating, and the lemma is proved.

THEOREM 2. A necessary and sufficient condition that a cyclic curve C
be the non-alternating image of a linear graph is that C be the sum of a
finite number of simple arcs.

Proof. Necessity: This is immediate from Theorem 1. Sufficiency:
Let $C = \sum_1^n A_i$, where n is finite and each $A_i$ is a simple arc. Let the 2n end
points of these simple arcs be denoted by $a_1, a_2, \dots, a_{2n}$, where an end point
is counted once for each arc of which it is an end point. Since C is cyclic, it
contains a simple closed curve $K_1$ passing through $a_1$ and $a_2$. If $K_1$ does not
contain all the points $a_i$, let $a_j$ be any one of these points which it does not
contain. Then, by the three point theorem⁵ we may find a simple arc spanning
$K_1$ and containing $a_j$. Thus we have found a cyclic linear graph containing
$a_1$, $a_2$, $a_j$. Repeating this process a finite number of times we shall obtain
a cyclic linear graph K which is a subset of C and which contains all
of the points $a_i$. Thus $K + A_i$ is cyclic for every i. We now write
$C = K + \sum_1^n A_i$, and the theorem follows at once by n applications of Lemma
1, and the addition of one of the arcs $A_i$ to K at each step.


⁵ See W. L. Ayres, Bulletin Académie Polonaise Science et Lettres (1928), pp.
127-142.




By a $\theta_n$-curve we shall mean a curve expressible as the sum of (n + 2)
simple arcs having the same end points but otherwise disjoint by pairs.

Using the same construction as that employed in Theorem 2, the following 
lemma is immediate. 


LEMMA 2. Every cyclic curve C expressible as the sum of n simple arcs
having the same end points is the non-alternating image of a $\theta_n$-curve.


It can be shown that no $\theta_1$-curve is the image of a $\theta_0$-curve under a non-
alternating transformation. Consequently, since a $\theta_1$-curve A is easily expres-
sible as the sum of two simple arcs having as end points two points interior
to different free arcs of A, it follows that the n in Lemma 2 cannot be
reduced. For by (W, 4.6) A is not the non-alternating image of a $\theta_0$-curve,
that is, of a simple closed curve, and hence not the non-alternating image of
any $\theta_k$-curve for k < 2.
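
The decomposition used in this remark can be made explicit. The sketch below is an illustration under assumptions of our own: the curve is taken to be the one formed by three simple arcs with common end points u and v (the case n = 1 of the definition above), the free arcs are called alpha, beta, gamma, and p and q are points chosen interior to alpha and beta. All labels are hypothetical. The code checks that the two paths listed are simple arcs with the same end points p and q and that together they cover the whole curve.

```python
# The curve with vertices u, v and free arcs alpha, beta, gamma, subdivided
# at an interior point p of alpha and an interior point q of beta.
EDGES = {
    frozenset(("p", "u")), frozenset(("p", "v")),  # the two halves of alpha
    frozenset(("q", "u")), frozenset(("q", "v")),  # the two halves of beta
    frozenset(("u", "v")),                         # gamma
}

# Two candidate simple arcs from p to q, given by their vertex sequences.
ARC_1 = ["p", "u", "v", "q"]
ARC_2 = ["p", "v", "u", "q"]


def edges_of(path):
    """Set of edges traversed by a vertex sequence."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def is_simple_arc(path, edges):
    """True if the path repeats no vertex and uses only edges of the graph."""
    return len(set(path)) == len(path) and edges_of(path) <= edges


covered = edges_of(ARC_1) | edges_of(ARC_2)
print(is_simple_arc(ARC_1, EDGES), is_simple_arc(ARC_2, EDGES))
print(covered == EDGES)
```

Both arcs pass through gamma, so they are not disjoint; the remark only requires that they have the same end points and that their sum is the whole curve.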

Our next lemma will remove the restriction that C be cyclic. 


LEMMA 3. Every curve C which is the sum of n simple arcs $A_i$
$(i = 1, 2, \dots, n)$ having the same end points a and b is the non-alternating
image of a $\theta_n$-curve.


Proof. For notational reasons we give the proof for the case n = 2,
since the general case follows in precisely the same way.

Since every arc in C joining a and b must be in every A-set⁶ containing
a and b it follows at once that C = C(a, b), that is, C is a simple cyclic chain
joining a and b.

Let K be the set of all points separating a and b in C. Then, since C
is a simple cyclic chain, we may write $C = (K + a + b) + \sum C_i$, where each
$C_i$ is a true cyclic element of C.

Define a $\theta_2$-curve $H = \sum_1^4 a'x_ib'$, where the $a'x_ib'$ are simple arcs having
the same end points but otherwise disjoint by pairs. We shall prove that C
is the image of H under a non-alternating transformation.

Let axb be any simple arc joining a and b in C. Then, by the definition
of K, it follows that K is a subset of axb. Let $Z_i(a'x_ib') = axb$ be a homeo-
morphism defined on $a'x_ib'$ (for each i) and sending a' and b' into a and b,
respectively. Let $K_i = Z_i^{-1}(K + a + b)$. Define a transformation Z of H


⁶ For definitions of the new terms used see G. T. Whyburn, American Journal of
Mathematics, vol. 50 (1928), pp. 167-194. In connection with A-sets see also Kura-
towski and Whyburn, loc. cit.




NON-ALTERNATING IMAGES OF LINEAR GRAPHS. 579 


into a new curve H' as follows: (i) Z is identical with $Z_i$ on $K_i$, (ii) Z is a
homeomorphism on $H - \sum_{i=1}^{4} K_i$. Then H' will coincide with C at all points of
(K + a + b). Moreover, by the definition of Z every true cyclic element of
H' will consist of four simple arcs disjoint by pairs except for their end points
and joining two points of (K + a + b) in C. Thus every true cyclic element
of H' will be a $\theta_4$-curve. But every true cyclic element of C is the sum of
four simple arcs having their end points in common and hence, by Lemma 2,
every such true cyclic element is the non-alternating image of a $\theta_4$-curve.
From the proof of Theorem 2 it follows that we may pick non-alternating
transformations sending the true cyclic elements of H' into those of C in
such a manner that they will map the points of $H'_i \cdot K$ (where $H'_i$ is the
given true cyclic element) into themselves. Thus we may define a transformation
W(H') = C to be the identity transformation on K, and to send each
true cyclic element of H' into the corresponding true cyclic element $C_i$ of C
(namely the one containing the same points of (K + a + b)) in a
non-alternating manner.

Let $H'_i$ be any true cyclic element of H' and suppose that $W(H'_i) = C_i$.
Then, by (W, 1.41), since W is non-alternating on $H'_i$ and $C_i$ is cyclic, it
follows that for no x in $C_i$ does $W^{-1}(x)$ separate $H'_i$. Also, by definition of
W, the images of the end points of each free arc of $H'_i$ are distinct. Thus W
is monotone on the closure of each free arc of $H'_i$.

Let axb be any simple arc joining a and b in H'. Then W is monotone on
axb. To prove this it is sufficient to show that if p and q be any two points
on axb such that W(p) = W(q), and if z be any point between p and q on
axb, then W(z) = W(p) = W(q). Since W is the identity transformation
on K, p and q cannot both lie in K, so we may assume that p lies on a
free arc of some true cyclic element $H'_i$ of H'. Then if q lies in $H'_i$ our
assertion is established since W is monotone on the closure of each free arc
of this set. If q is not in $H'_i$, then, since W maps disjoint cyclic elements of
H' into disjoint cyclic elements of C, it follows that q lies in some $H'_k$, where
$H'_i \cdot H'_k = y$, a single point (which may or may not be q). Let pyq be a
simple arc joining p to q in $H'_i + H'_k$. Then, since W is monotone on the
closure of each free arc of both $H'_i$ and $H'_k$, we have W(pyq) = W(y), so
that in particular, W(z) = W(y) = W(p) = W(q), which proves W
monotone on axb.

By definition the transformation Z(H) = H' is monotone on the closure
of each free arc of H, hence by (W, 2.2), if we define a transformation
T = WZ it follows that T is monotone on the closure of each free arc of H.


580 DICK WICK HALL. 


By definition we have T(H) = W(H') = C, so that the lemma will be established
if we show that T is non-alternating on H.

The set H is locally connected, hence by (W, 1.5), in order to show that
T(H) = C is non-alternating, it is sufficient to show that for each point $q \in C$
and each component K of C − q, the set $T^{-1}(K)$ is connected. Letting q be
any point of C, two cases must be considered.

Case 1. q is a cut point of C.

Then q is distinct from both a and b. Recalling that by definition
$H = \sum_{i=1}^{4} a'x_ib'$, it follows that $B_i = T(a'x_ib') = ax_ib$ is a simple arc since T
is monotone on the closure of each of the simple arcs $a'x_ib'$. Since q cuts C it
lies on all the arcs $B_i$, so we may write $B_i = ax_iq + qy_ib$. Then since
$C = \sum_{i=1}^{4} B_i$, we have

$C - q = \sum_{i=1}^{4} (ax_iq - q) + \sum_{i=1}^{4} (qy_ib - q) = M_1 + M_2.$

Then $M_1$ and $M_2$ are both connected and closed in C − q. Consequently,
since C − q is disconnected, these sets are mutually separated, and hence
components of C − q.

Let a transformation $T_i$ be defined as identical with T on $a'x_ib'$, and
undefined elsewhere. Then $T_i$ is monotone and $T_i(a'x_ib') = B_i$. Thus for
every i the sets $T_i^{-1}(ax_iq - q)$ and $T_i^{-1}(qy_ib - q)$ are connected. But these
are precisely the sets

(1) $T^{-1}(ax_iq - q) \cdot (a'x_ib'),$

(2) $T^{-1}(qy_ib - q) \cdot (a'x_ib'),$

respectively, so that the sets (1) and (2) are connected for all i. Thus

$a'x_ib' - T^{-1}(q) = T^{-1}(ax_iq - q) \cdot (a'x_ib') + T^{-1}(qy_ib - q) \cdot (a'x_ib'),$

and therefore, summing over i,

$H - T^{-1}(q) = \sum_{i=1}^{4} T^{-1}(ax_iq - q) \cdot (a'x_ib') + \sum_{i=1}^{4} T^{-1}(qy_ib - q) \cdot (a'x_ib') = N_1 + N_2,$

and from (1) and (2), it follows that $N_1$ and $N_2$ are connected. They are
evidently disjoint and closed in their sum, so they are mutually separated.

For any point $x \in N_1$, we have $T(x) \in \sum_{i=1}^{4} (ax_iq - q) \subset M_1$, so that
$x \in T^{-1}(M_1)$, hence $N_1 \subset T^{-1}(M_1)$, and similarly $N_2 \subset T^{-1}(M_2)$.


Moreover, for any point $x \in T^{-1}(M_1)$ we have $T(x) \in M_1$ so that
$T(x) \notin M_2$. Consequently, since $N_2 \subset T^{-1}(M_2)$, we have $x \notin N_2$; therefore
$x \in N_1$, so that $T^{-1}(M_1) \subset N_1$. Thus $N_1 = T^{-1}(M_1)$, and similarly
$N_2 = T^{-1}(M_2)$, so both of these sets are connected and the lemma follows for
Case 1.


Case 2. q is a non-cut point of C.

We have again $B_i = T(a'x_ib')$. If $q \in B_i$ write $B_i = ax_iq + qy_ib$.
Otherwise, write $B_i = ax_ib$. Let the $a'x_ib'$ be numbered so that for some
integer j, $q \in B_i$ ($1 \leq i \leq j$), $q \notin B_i$ ($i > j$). (If $q \in B_i$ for every i, then
we have, of course, j = 4 since H contains but four free arcs by hypothesis.)
Then

$C - q = \sum_{i=j+1}^{4} ax_ib + \sum_{i=1}^{j} (ax_iq - q) + \sum_{i=1}^{j} (qy_ib - q).$

Define

$M = \sum_{i=1}^{j} T^{-1}(ax_iq - q) \cdot (a'x_ib'), \quad N = \sum_{i=1}^{j} T^{-1}(qy_ib - q) \cdot (a'x_ib'),$

$Z = \sum_{i=j+1}^{4} T^{-1}(ax_ib) \cdot (a'x_ib').$

By the reasoning used in Case 1, each of the sets M, N, and Z is vacuous or
connected.

Assume, first, that Z is non-vacuous. Then q is distinct from both a and b
so that M contains a', N contains b', and Z contains both of these points.
Therefore, since each of these sets is connected we have that (M + N + Z)
is a connected set.

Secondly, if Z is vacuous, then q occurs on every simple arc $B_i$. If q = a,
then M = 0, and both N and Z contain b', so that (M + N + Z) is connected.
Similarly, if q = b, then N = 0, both M and Z contain a', and, consequently,
(M + N + Z) is connected. If q is distinct from both a and b, then, since
q does not cut C, there exist two simple arcs $B_i$ and $B_k$ in C and a point p in C
which precedes q on the arc $B_i$ but follows it on the arc $B_k$. Then M and N
have $T^{-1}(p)$ in common so that (M + N) and hence (M + N + Z) is connected.
Therefore, in every event, (M + N + Z) is a connected set.

But, since $C = \sum_{i=1}^{4} T(a'x_ib')$, we have that $T^{-1}(C - q) = M + N + Z$, so
that this set is connected. Therefore, for any point q of C and every component
K of C − q, the set $T^{-1}(K)$ is connected. Consequently, by (W, 1.5),
the transformation T(H) = C is non-alternating, as was to be proved.


LEMMA 4. Suppose that A, C, and A + C are connected and that
$A \cdot C = p$, a single point. Let $T^1(A) = B$ and $T^2(C) = D$ be non-alternating
transformations such that $T^1 = T^2$ on $A \cdot C$. Define a transformation $T = T^1$
on A, $T = T^2$ on C. Then a necessary and sufficient condition that T be
non-alternating on (A + C) is that $B \cdot D = T(p)$.


Proof. The condition is clearly necessary. Otherwise, the inverse of some
point x of $(B \cdot D) - T(p)$ would intersect both A − p and C − p, so that this
inverse would be separated by $T^{-1}T(p)$ in (A + C) which would make T
alternate on this set.

The condition is also sufficient. Otherwise, there exist two points y', y'' in
(A + C), with T(y') = T(y''), and a point x in B + D such that $T^{-1}(x)$
separates y' from y'' in (A + C). Then (i) if y' and y'' both lie in A or C,
we have a contradiction to the fact that T is non-alternating on these respective
sets; (ii) if $y' \in A$, $y'' \in C$, or conversely, then, since T(y') = T(y'') and
$B \cdot D = T(p)$, we have T(y') = T(y'') = T(p), and thus $p \notin T^{-1}(x)$. Hence,
by the definition of T, $T^{-1}(x)$ cannot separate either y' or y'' from p; so that
it cannot separate y' from y'', and the lemma follows.


LEMMA 5. If $C = \sum_{i=1}^{n} A_i$ be connected, where n is finite and each $A_i$ is a
connected A-set (in some locally connected continuum S) which is the
non-alternating image of a linear graph and where for no i, j ($i \neq j$) does $A_i \cdot A_j$
contain more than one point, then C is the non-alternating image of a linear
graph.


Proof. By hypothesis there exists a set of disjoint linear graphs $H_i$
($i = 1, 2, \dots, n$) and a set of non-alternating transformations $T_i(H_i) = A_i$.

Consider the A-set $A_1$. Since C is connected, at least one of the other sets
$A_i$, say $A_2$, must intersect it. Then $A_1 \cdot A_2 = p_{12}$, a single point by hypothesis.
Let $p_1$, $p_2$ be any two points of $T_1^{-1}(p_{12})$ and $T_2^{-1}(p_{12})$ respectively, and
translate $H_2$ until it has the single point $p_1 = p_2$ in common with $H_1$. Then,
by Lemma 4, the transformation $T_{12}$ which is $T_1$ on $H_1$ and $T_2$ on $H_2$ is
non-alternating, and $T_{12}(H_1 + H_2) = A_1 + A_2$.

Repeating this process we may add on one A-set at a time, and at each
stage secure a non-alternating transformation sending $\sum_{i=1}^{k} H_i$, for example,
into $\sum_{i=1}^{k} A_i$ at the k-th stage. It follows at once that $A_k$ cannot have two points
in common with $\sum_{i=1}^{k-1} A_i$, since if p and q were any two such points they would,
by hypothesis, lie in different A-sets of the above sum. Then, using the fact
that each of the A-sets is connected and locally connected, we could construct
a simple arc joining p and q and not lying wholly in $A_k$, which is contradictory
to the definition of an A-set. Thus the extension may be made in exactly the
same way at every stage and the lemma follows.


LEMMA 6. Every simple cyclic chain which is the sum of a finite number
of simple arcs is the non-alternating image of a linear graph.


Proof. Let C(a, b) be the simple cyclic chain which is the sum of j simple
arcs by hypothesis, and let $a_i$ ($i = 1, 2, \dots, 2j$) be the endpoints of these
simple arcs. Then we may find 2j cyclic elements of C(a, b) whose sum contains
all the points $a_i$. Let m be the number of these cyclic elements which are
distinct, and number them $K_i$ in the order in which they occur from a to b.
Since C(a, b) is a simple cyclic chain, let it be expressed in the form
$C = (K + a + b) + \sum C_i$, where the $C_i$ are its true cyclic elements, and
K consists of all those points separating a and b in C(a, b). Let axb be any
simple arc joining a and b in C(a, b), and let $x_i$, $y_i$ be respectively its first and
last intersections with the set $K_i$. Evidently $x_i = y_i$ if $K_i$ is degenerate.
Then $x_1 = a$, $y_m = b$, while all of the points $x_i$, $y_i$ which are distinct from
these two are points of K. Consequently, the choice of the points $x_i$, $y_i$ is
independent of the arc axb we used. Define $M_{2i-1} = C(x_i, y_i)$, $M_{2i} = C(y_i, x_{i+1})$
as simple cyclic chains in C. Some of the sets $M_i$ may be degenerate.

Evidently the chains $M_i$ as constructed are finite in number and
$C(a, b) = \sum M_i$. Moreover, these sets are A-sets, by definition, and for any
i, j ($i \neq j$), $M_i \cdot M_j$ contains not more than one point. Our lemma will thus
follow by Lemma 5 if we show that, for every i, $M_i$ is the non-alternating
image of a linear graph. This follows at once for every $M_{2i-1}$, by Theorem II,
since every such set is a $K_i$, hence cyclic, and is the sum of a finite number
of simple arcs, since C is. Now $M_{2i}$ is a cyclic chain joining $y_i$ to $x_{i+1}$, and
both these points belong to K. Moreover, with the possible exception of these
two points, $M_{2i}$ cannot contain end points of any of the simple arcs which go
to make up C(a, b); whence $M_{2i}$ is the sum of a finite number of simple arcs
joining $y_i$ and $x_{i+1}$. The lemma is then an immediate consequence of Lemma 3.


THEOREM III. Every curve C which is the sum of a finite number of 
simple arcs is the non-alternating image of a linear graph. 


Proof. Let C be the sum of j simple arcs. Then these arcs have not more
than 2j distinct end points; whence C contains not more than 2j nodes,* since
every node of C must evidently contain an end point of at least one of the j
simple arcs of which C is the sum. Let there be h nodes in C and call them $E_i$
($i = 1, 2, \dots, h$). Let $p_i$ be any non-cut point contained in $E_i$. Define
$M_2 = C(p_1, p_2)$. Let $p_1xp_3$ be any simple arc joining $p_1$ to $p_3$ and let $q_3$ be the
last intersection of this arc with $M_2$. Define $M_3 = C(p_3, q_3)$. In general, if
$M_{k-1}$ has been defined let $p_1xp_k$ be a simple arc joining $p_1$ to $p_k$ in C and let $q_k$
be its last intersection with $\sum_{i=2}^{k-1} M_i$. Define $M_k = C(p_k, q_k)$.

* See G. T. Whyburn, American Journal of Mathematics, vol. 50 (1928), p. 178.

Evidently, $C = \sum_{i=2}^{h} M_i$, and each $M_i$ is the non-alternating image of a
linear graph by virtue of Lemma 6. Moreover, the conditions of Lemma 5 are
obviously fulfilled so that the theorem follows by virtue of that lemma.

The converse of Theorem III is false. For, let D be any dendrite which 
is not the sum of a finite number of simple arcs (and such a dendrite may 
easily be constructed). Then, by (W, p. 301), D is a boundary curve, and 
hence, by (W, 4.6), D is the non-alternating image of a circle, which is cer- 
tainly a linear graph. Therefore, the non-alternating image of a circle need 
not be the sum of a finite number of simple arcs. 


UNIVERSITY OF VIRGINIA. 




REPRESENTATIONS IN CERTAIN PURE FORMS OF DEGREES 
HIGHER THAN THE SECOND.* 


By E. T. BELL.


1. Introduction. The method of Lagrange * for finding parametric
solutions of certain diophantine equations may be taken as the point of departure
for obtaining representations of integers in special forms of degree > 2.
This will be illustrated by starting from the equation


(1.1) $a^3 + b^3 + pc^3 = 0, \quad pabc \neq 0,$


in which p is a constant integer. Numerical examples are given in § 5.
Another method is indicated in § 6, with examples. Throughout the paper,
x, y, z, u, v, w denote real or complex variables, other small letters, rational
integers. Without loss of generality a, b may be taken coprime, (a, b) = 1,
and any divisor $r^3$ of p may be absorbed in c if desired. It was shown by
Lucas * that if {d} denotes d divided by the greatest cube divisor of d, a necessary
and sufficient condition that (1.1) have a solution is that p be of the
form {st(s + t)}.

If $(a, b, c) = (a_n, b_n, c_n)$ is any solution of (1.1), $(a_{n+1}, b_{n+1}, c_{n+1})$ is also
a solution, where


(1.2) $a_{n+1} = a_n(a_n^3 + 2b_n^3), \quad b_{n+1} = -b_n(2a_n^3 + b_n^3), \quad c_{n+1} = c_n(a_n^3 - b_n^3),$


and if $(a_{n+1}, b_{n+1}, c_{n+1}) \neq (ka_n, kb_n, kc_n)$, the two solutions are said to be
distinct. A parametric solution of (1.1), due to Lucas, is


(1.3) + + 32y7, + + bay’, 
+ cy + y’), 


C 


if a, b, c, p are integers, x, y run through all integers. By means of (1.3), 

* Received April 30, 1937. 

* Supplement to Euler’s Algebra; see also R. D. Carmichael’s Diophantine Analysis. 

*E. Lucas, American Journal of Mathematics, vol. 2 (1879), p. 184. The paper 
by L. Holzer in Journal fiir Mathematik, vol. 159 (1928), pp. 93-100, discusses the same 
equation, and obtains (incidentally) some of the results of Lucas and Sylvester, whose 
papers on the subject appear to have been overlooked by the author. 

* Lucas asserts, loc. cit., p. 185, that if $|a| \neq |b|$, there is an infinity of distinct
solutions. This has not been proved.



Lucas solved (in particular) (1.1) with p= 6, which Legendre had mis- 
takenly asserted to be unsolvable in integers. 
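The iteration (1.2) is easy to try numerically. The following is a minimal sketch, assuming exact integer arithmetic; the function names `lucas_iterate`, `is_solution` and `primitive` are illustrative and do not come from the paper.

```python
from math import gcd

def lucas_iterate(a, b, c):
    """Apply the iteration (1.2) once to a solution of a^3 + b^3 + p*c^3 = 0."""
    a3, b3 = a**3, b**3
    return a * (a3 + 2 * b3), -b * (2 * a3 + b3), c * (a3 - b3)

def is_solution(a, b, c, p):
    """Exact integer check of a^3 + b^3 + p*c^3 = 0."""
    return a**3 + b**3 + p * c**3 == 0

def primitive(triple):
    """Divide out the common factor of a triple (hypothetical helper)."""
    g = gcd(gcd(abs(triple[0]), abs(triple[1])), abs(triple[2]))
    return tuple(t // g for t in triple) if g else triple

step = lucas_iterate(2, -1, -1)        # start from 8 - 1 - 7 = 0, so p = 7
print(step, is_solution(*step, 7))     # (12, 15, -9) True
print(primitive(step))                 # (4, 5, -3)
print(lucas_iterate(1, 1, -1))         # (3, -3, 0): here c = 0, so pabc = 0
```

The last line shows why the condition $pabc \neq 0$ matters: starting from (1, 1, −1) with p = 2 the iterate has c = 0.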


2. Identities from (1.1). Let (a, b, c) be any solution of (1.1), without
the conditions (a, b) = 1, | c | free of cube divisors > 1, so that


(2.1) $a^3 + b^3 + pc^3 = 0, \quad pabc \neq 0.$


We shall determine the parameters w, x, y, z so that, identically in w, x, y, z,


(2.2) $(aw + x)^3 + (bw + y)^3 + p(cw + z)^3 = 3Aw + B,$
with A, B independent of w.

LEMMA 1. If x, y, z are such that
(2.3) $a^2x + b^2y + pc^2z = 0,$


where (a, b, c) is a particular solution of (2.1), then identically in w, x, y, z,


(2.4) $pc^4[(aw + x)^3 + (bw + y)^3 + p(cw + z)^3] = -(bx - ay)^2(3abcw + bcx + cay + abz).$


For, from (2.1), (2.2) we get (2.3) and $A = ax^2 + by^2 + pcz^2$,
$B = x^3 + y^3 + pz^3$; whence, eliminating z by (2.3) we have


$-pc^3A = ab(bx - ay)^2,$
$p^2c^6B = (bx - ay)^2[b(2a^3 + b^3)x + a(a^3 + 2b^3)y].$


Hence, if (2.2) be multiplied throughout by $p^2c^6$, the right becomes $(bx - ay)^2F$,


$F = -[3abpc^3w - (2a^3 + b^3)bx - (a^3 + 2b^3)ay]$
$= -[3abpc^3w - (a^3 - pc^3)bx - (b^3 - pc^3)ay]$
$= -[pc^3(3abw + bx + ay) - ab(a^2x + b^2y)]$
$= -pc^2(3abcw + bcx + cay + abz);$


which completes the proof of (2.4).
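The identity of Lemma 1 can be spot-checked with exact rational arithmetic. This is a minimal sketch under stated assumptions: z is eliminated through (2.3), so it is in general a fraction, and the helper names `lemma1_sides` and `lemma1_holds` are hypothetical. A grid check of this kind supports the identity but is not a substitute for the proof.

```python
from fractions import Fraction
from itertools import product

def lemma1_sides(a, b, c, p, w, x, y):
    """Return both sides of (2.4), with z obtained from (2.3)."""
    z = Fraction(-(a * a * x + b * b * y), p * c * c)   # (2.3) solved for z
    lhs = p * c**4 * ((a * w + x)**3 + (b * w + y)**3 + p * (c * w + z)**3)
    rhs = -(b * x - a * y)**2 * (3 * a * b * c * w + b * c * x + c * a * y + a * b * z)
    return lhs, rhs

def lemma1_holds(a, b, c, p, span=range(-3, 4)):
    """Check (2.4) for all integer w, x, y in the given range."""
    assert a**3 + b**3 + p * c**3 == 0, "(a, b, c) must solve (2.1)"
    for w, x, y in product(span, repeat=3):
        lhs, rhs = lemma1_sides(a, b, c, p, w, x, y)
        if lhs != rhs:
            return False
    return True

print(lemma1_holds(1, 1, -1, 2), lemma1_holds(2, -1, -1, 7), lemma1_holds(4, 5, -3, 7))
```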


LEMMA 2. If, without loss of generality, (a, b) = 1 in (2.1), so that
(integers) r, s may be determined such that


(2.5) $a^2r + b^2s = -1,$
then, identically in w, u, v,


(2.6) $(aw + prc^2u + b^2v)^3 + (bw + psc^2u - a^2v)^3 + p(cw + u)^3 = -pM^2(3abcw + R),$


where $M = (br - as)u - cv,$
$R = (prbc^3 + psac^3 + ab)u + c(b^3 - a^3)v.$
For, from (2.5), (2.3) we get
$(x, y, z) = (prc^2u + b^2v, \; psc^2u - a^2v, \; u),$


with (u, v) arbitrary, and the result follows by (2.1), (2.4).
In (2.6) we now choose (u, v) = (m, n), m, n integers, and reduce the
right numerically. If [y] is the greatest integer in y, we write


(2.7) $G = \left[\frac{|R|}{3|abc|}\right], \quad \eta = \operatorname{sgn} R, \quad \epsilon = \operatorname{sgn}(abcR),$


after having replaced (u, v) by (m, n) in R, where sgn y denotes 1, 0, −1
according as y > 0, y = 0, y < 0. Then


(2.8) $|R| = 3|abc|G + \rho, \quad 0 \leq \rho < 3|abc|.$
Replacing w by $w - \epsilon G$ in 3abcw + R we find $3abcw + \eta\rho$.
LEMMA 3. With G, ρ as in (2.7), (2.8),


$(aw + prc^2m + b^2n - \epsilon aG)^3 + (bw + psc^2m - a^2n - \epsilon bG)^3 + p(cw + m - \epsilon cG)^3 = -pM^2(3abcw + \eta\rho),$
identically in w, where


$M = (br - as)m - cn,$
and a, b, c, r, s are as in Lemma 2.
Returning to (2.6) we recall that (r, s) are defined by (2.5). Let
(2.9) $(br - as, c) = g.$
Then integers h, k may be found such that
(2.10) $(br - as)h - ck = g,$
and (2.12) is a solution of


(2.11) $(br - as)u - cv = gx,$


(2.12) $u = hx + cy, \quad v = kx + (br - as)y.$


LEMMA 4. With r, s, g, h, k as in (2.5), (2.9), (2.10),

$X^3 + Y^3 + pZ^3 = -pg^2x^2(3abcw + R),$

identically in w, x, y, where


$X = aw + (prc^2h + b^2k)x + ay,$
$Y = bw + (psc^2h - a^2k)x + by,$
$Z = cw + hx + cy,$
$R = [h(prbc^3 + psac^3 + ab) + kc(b^3 - a^3)]x + [c(prbc^3 + psac^3 + ab) + (br - as)c(b^3 - a^3)]y,$


and a, b, c are as in Lemma 2.


As in Lemma 2, R in Lemma 4 may be reduced numerically when x, y
are integers. Write


(2.13) $\rho = h(prbc^3 + psac^3 + ab) + kc(b^3 - a^3),$
$\sigma = c(prbc^3 + psac^3 + ab) + (br - as)c(b^3 - a^3);$


ρ, σ are integers, and in Lemma 4, R = ρx + σy.
3. Consequences of Lemmas 3, 4. In Lemma 3 take w=0. Then 


THEOREM 1. With M as in Lemma 3, every $\pm p\rho M^2$ (integer) is of the
form $\alpha^3 + \beta^3 + p\gamma^3$, with α, β, γ integers, and with at most 3 exceptions
$M \neq 0$, all of α, β, γ may be chosen $\neq 0$.


In Lemma 4 take w = 0; y = 0. Then
THEOREM 2. Every $\pm pg^2\sigma x^2y$, and every $\pm 3abcpg^2x^2w$ is of the form
$X^3 + Y^3 + p(Z^3 + g^2\rho U^3),$


with ρ, σ as in (2.13), g as in (2.9). If x, y, w are integers, X, Y, Z, U may
be chosen integers, and in each case with at most 4 exceptions, all different
from zero.


4. Further consequences of (2.3). Returning to Lemma 1, we now
find common solutions (x, y, z) of (2.3) and


(4.1) $bx - ay = hpc.$


Assuming (without loss of generality) as before that (a, b) = 1, we can find
f, g such that


(4.2) $bf - ag = 1.$
Then the solution of (4.1) is


$(x, y) = (hpcf + ka, \; hpcg + kb),$
and this will give z an integer in (2.3) provided


$pc^2 \mid (hpcf + ka), \quad pc^2 \mid (hpcg + kb).$
Hence


$p \mid k, \quad k = tp; \quad h = vc, \quad t = nc^2,$
and a common solution of (2.3), (4.1) is


(4.3) $x = pc^2(an + fv), \quad y = pc^2(bn + gv), \quad z = pc^3n - (a^2f + b^2g)v.$


Replacing w by $w - pc^2n$ in (2.4) and reducing the result we find


LEMMA 5. If (a, b, c) is any solution of (2.1) with (a, b) = 1, and
f, g are determined by (4.2), then, identically in w, v,


(4.4) $(aw + pc^2fv)^3 + (bw + pc^2gv)^3 + p[cw - (a^2f + b^2g)v]^3 = -pv^2[3abcw - \{a^3 - (3ag + 1)pc^3\}v].$


For numerical reductions of (4.4) we write


(4.5) $A = 3abc, \quad B = (3ag + 1)pc^3 - a^3,$
$|B| = Q|A| + R, \quad 0 \leq R < |A|;$
$\operatorname{sgn}(AB) = \epsilon, \quad \operatorname{sgn} B = \eta.$


LEMMA 6. With the notations of Lemma 5 and (4.5),
$(aw + \alpha v)^3 + (bw + \beta v)^3 + p(cw + \gamma v)^3 = -pv^2(Aw + \eta Rv),$
identically in w, v, where


$\alpha = pc^2f - \epsilon aQ, \quad \beta = pc^2g - \epsilon bQ, \quad -\gamma = a^2f + b^2g + \epsilon cQ.$

Since every integer n is of the form $rs^2$ in at least one way we have
THEOREM 3. Every 3abcpn is of the form

$\alpha^3 + \beta^3 + p(\gamma^3 + \eta R\delta^3),$


with the notation as in Lemma 6; and with at most 3 exceptions n, all the
integers α, β, γ, δ may be chosen $\neq 0$.


THEOREM 4. In the statement of Theorem 2, $x^2y$ and $x^2w$ may be
replaced by n.


5. Identities with fourth powers. By integrating the identities in the
preceding lemmas with respect to the parameters between suitable limits we
ascend from identities involving cubes to others involving fourth powers. It
will be sufficient to illustrate the general process for Lemma 6; the actual
ascent in numerical examples is most readily made directly from the examples.
Integrating with respect to w between the limits 0 and w we find


LEMMA 7. Identically in w, v,


$bc(aw + \alpha v)^4 + ca(bw + \beta v)^4 + pab(cw + \gamma v)^4 - (bc\alpha^4 + ca\beta^4 + pab\gamma^4)v^4 = -2abcpv^2w(Aw + 2\eta Rv),$


the notation being as in Lemma 6.
Integration with respect to v between the limits 0 and v gives


LEMMA 8. Identically in w, v,


$3[\beta\gamma(aw + \alpha v)^4 + \gamma\alpha(bw + \beta v)^4 + p\alpha\beta(cw + \gamma v)^4 - (\beta\gamma a^4 + \gamma\alpha b^4 + p\alpha\beta c^4)w^4] = -p\alpha\beta\gamma v^3(4Aw + 3\eta Rv),$


the notation being as in Lemma 6.


For w = 1 or v = 1 the last two give theorems for fourth powers similar
to those for cubes.


5. Numerical examples. An indefinite number of special results are 
furnished by the preceding lemmas and theorems for particular solutions of 
(2.1). It will suffice to illustrate Lemma 6 (a further, more systematic 
solution from the numerous results on hand will be given on another occasion). 
The obvious solution (a, b,c, p) = (a,b, —1, a® + b*) gives some interesting 
results for various a, b. 

The choice (a, b, c, p) = (1, 1, −1, 2), (f, g) = (1, 0), gives A = −3,
B = −3, Q = 1, R = 0, ε = 1, α = 1, β = −1, γ = 0. Hence
(by Lemma 6),
(5.1) $(w + v)^3 + (w - v)^3 - 2w^3 = 6wv^2,$


a well known identity.
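The reduction (4.5) and the coefficients of Lemma 6 can be computed mechanically. The sketch below is an illustration, not the author's procedure: it assumes the remainder R is taken with 0 ≤ R < |A| (Python's `divmod` on absolute values), and the names `lemma6_parameters` and `lemma6_holds` are hypothetical.

```python
def sgn(t):
    """Sign of an integer: 1, 0 or -1."""
    return (t > 0) - (t < 0)

def lemma6_parameters(a, b, c, p, f, g):
    """Return (A, R, eta, alpha, beta, gamma) as in (4.5) and Lemma 6."""
    assert a**3 + b**3 + p * c**3 == 0 and a * b * c * p != 0   # (2.1)
    assert b * f - a * g == 1                                    # (4.2)
    A = 3 * a * b * c
    B = (3 * a * g + 1) * p * c**3 - a**3
    Q, R = divmod(abs(B), abs(A))          # |B| = Q|A| + R, 0 <= R < |A|
    eps, eta = sgn(A * B), sgn(B)
    alpha = p * c * c * f - eps * a * Q
    beta = p * c * c * g - eps * b * Q
    gamma = -(a * a * f + b * b * g) - eps * c * Q
    return A, R, eta, alpha, beta, gamma

def lemma6_holds(a, b, c, p, f, g, span=range(-4, 5)):
    """Check the identity of Lemma 6 on an integer grid of (w, v)."""
    A, R, eta, al, be, ga = lemma6_parameters(a, b, c, p, f, g)
    return all(
        (a * w + al * v)**3 + (b * w + be * v)**3 + p * (c * w + ga * v)**3
        == -p * v * v * (A * w + eta * R * v)
        for w in span for v in span
    )

print(lemma6_parameters(1, 1, -1, 2, 1, 0))   # (-3, 0, -1, 1, -1, 0)
print(lemma6_holds(1, 1, -1, 2, 1, 0))        # True
```

With (a, b, c, p) = (1, 1, −1, 2) and (f, g) = (1, 0) the coefficients are α = 1, β = −1, γ = 0, which is the case leading to the identity $(w + v)^3 + (w - v)^3 - 2w^3 = 6wv^2$.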


(5.2) Every 6n is of the form $a^3 + b^3 - 2c^3$, and if n > 1, we may choose
a, b, c > 0.


The iteration (1.2) applied to (a, b, c) = (1, 1, −1) gives
(a, b, c) = (3, −3, 0),
not a solution of (2.1) since here pabc = 0. For
(a, b, c, p) = (2, −1, −1, 7)


we take (f, g) = (−3, 1) and find B = −57, Q = 9, R = 3,
ε = −1, η = −1, α = −3, β = −2, γ = 2:


(5.3) $(2w - 3v)^3 - (w + 2v)^3 - 7(w - 2v)^3 = -21v^2(2w - v).$
Replace w by −w. Then


(5.4) Every 42n is of the form $a^3 - b^3 + 7(c^3 - 3d^3)$, and if n > 2, we
may take a, b, c, d > 0.


Iteration as in (1.2) of (a, b, c) = (2, −1, −1) gives the new solution
(a, b, c) = (4, 5, −3) of (2.1) with p = 7. For this we find (f, g) = (1, 1),
A = −180, B = −2521, Q = 14, R = 1, ε = 1, η = −1, α = 7, β = −7,
γ = 1:
(5.5) $(4w + 7v)^3 + (5w - 7v)^3 - 7(3w - v)^3 = 7v^2(180w + v).$


(5.6) Every 1260n is of the form $a^3 + b^3 - 7(c^3 + d^3)$, and if n > 1,
we may take a, b, c, d > 0; by (5.4) every 1260n is also of the form


$a^3 - b^3 + 7(c^3 - 3d^3)$
with a, b, c, d > 0.


In the same way we find the following. From (a, b, c, p) = (2, 1, −1, 9),
(5.7) $(2w + 3v)^3 + (w - 3v)^3 - 9(w + v)^3 = 9v^2(6w - v);$
from (a, b, c, p) = (2, 3, −1, 35),
(5.8) $(2w + 7v)^3 + (3w - 7v)^3 - 35(w - v)^3 = 35v^2(18w + v);$


from (a, b, c, p) = (3, −2, −1, 19),

(5.9) $(2w + 5v)^3 - (3w - 2v)^3 + 19(w - 2v)^3 = 19v^2(18w - v);$
from (a, b, c, p) = (5, −4, −1, 61),

(5.10) $(4w + 13v)^3 - (5w + v)^3 + 61(w - 3v)^3 = 183v^2(20w + 3v).$


From (5.7)-(5.10) we write down the results corresponding to (5.6),
etc. Thus from (5.10),


(5.11) Every 3660n is of the form $a^3 - b^3 + 61(c^3 - 9d^3)$, and if n > 3,
we may take a, b, c, d > 0; every 183(20n + 3) is of the form $a^3 - b^3 + 61c^3$,
and if n > 3, a, b, c > 0.
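The cubic identities (5.7) to (5.10), read off from the stated solutions (a, b, c, p), can be confirmed on an integer grid, and (5.10) with v = 1 gives an explicit representation of 3660n. The following is a minimal sketch; the table `IDENTITIES` and the helpers `failures` and `represent_3660n` are hypothetical names introduced here.

```python
from itertools import product

# (label, left side, right side), each a function of integers w, v
IDENTITIES = [
    ("5.7", lambda w, v: (2*w + 3*v)**3 + (w - 3*v)**3 - 9*(w + v)**3,
            lambda w, v: 9 * v**2 * (6*w - v)),
    ("5.8", lambda w, v: (2*w + 7*v)**3 + (3*w - 7*v)**3 - 35*(w - v)**3,
            lambda w, v: 35 * v**2 * (18*w + v)),
    ("5.9", lambda w, v: (2*w + 5*v)**3 - (3*w - 2*v)**3 + 19*(w - 2*v)**3,
            lambda w, v: 19 * v**2 * (18*w - v)),
    ("5.10", lambda w, v: (4*w + 13*v)**3 - (5*w + v)**3 + 61*(w - 3*v)**3,
             lambda w, v: 183 * v**2 * (20*w + 3*v)),
]

def failures(span=range(-5, 6)):
    """Labels of identities that fail somewhere on the grid (expected: none)."""
    return [label for label, lhs, rhs in IDENTITIES
            if any(lhs(w, v) != rhs(w, v) for w, v in product(span, repeat=2))]

def represent_3660n(n):
    """(a, b, c, d) with 3660n = a^3 - b^3 + 61(c^3 - 9 d^3), from (5.10) at v = 1."""
    return 4*n + 13, 5*n + 1, n - 3, 1

a, b, c, d = represent_3660n(4)
print(failures(), (a, b, c, d), a**3 - b**3 + 61 * (c**3 - 9 * d**3))
```

For n > 3 all four values returned by `represent_3660n` are positive, in line with the bound stated in (5.11).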


One example of § 4 will suffice. Integrating (5.7) with respect to v
between 0 and v we get


(5.12) $(2w + 3v)^4 - (w - 3v)^4 + 3[4w^4 + 9v^4 - 9(w + v)^4] = 216wv^3;$


(5.13) Every 27(8n − 1) is of the form $a^4 - b^4 + 3(4c^4 - 9d^4)$, with
a, b, c, d > 0 if n > 3.


We give some miscellaneous examples, illustrative of general devices.
Taking v = 1, 2, ..., 6 in


$(20w - 3v)^3 - (17w - 3v)^3 - 9(7w - v)^3 = 9v(6w - v)^2,$


obtained by the preceding methods, we get


(5.14) $(20w - 3)^3 - (17w - 3)^3 - 9(7w - 1)^3 = 9(6w - 1)^2,$
$8(10w - 3)^3 - (17w - 6)^3 - 9(7w - 2)^3 = 72(3w - 1)^2,$
$(20w - 9)^3 - (17w - 9)^3 - 9(7w - 3)^3 = 243(2w - 1)^2,$
$64(5w - 3)^3 - (17w - 12)^3 - 9(7w - 4)^3 = 144(3w - 2)^2,$
$125(4w - 3)^3 - (17w - 15)^3 - 9(7w - 5)^3 = 45(6w - 5)^2,$
$8(10w - 9)^3 - (17w - 18)^3 - 9(7w - 6)^3 = 1944(w - 1)^2.$


Hence, for example, every $72(3n - 1)^2$ is of the form $8a^3 - b^3 - 9c^3$, with
a, b, c > 0 if n > 0. An interesting specimen of this kind, from another
identity, is


(5.15) Every 168n? is of the form a® + 8b* — %c*, with a,b,c > 0 if n >%, 
and every 21(2n 1)? is of the form a® + — with a,b,c > 0 if 


Another kind is illustrated from the pair


$(2w + 3v)^3 + (w - 3v)^3 - 9(w + v)^3 = 9v^2(6w - v),$
$(2w - 7v)^3 + (3w + 7v)^3 - 35(w + v)^3 = 35v^2(18w - v).$


In the first replace w by 3w and subtract from the second. Then


$35[3(2w + v)^3 + 3(w - v)^3 - (3w + v)^3 + (w + v)^3] = (2w - 7v)^3 + (3w + 7v)^3.$


In this we now make any term, say $(3w + 7v)^3$, equal to $x^3$. Hence (in this
case) w = −2x − 7u, v = x + 3u, and we get


(5.16) $x^3 = (11x + 35u)^3 + 35[(5x + 18u)^3 - (x + 4u)^3 - 3(3x + 10u)^3 - 3(3x + 11u)^3];$


(5.17) Every $n^3$ is of the form $a^3 + 35(b^3 - c^3 - 3d^3 - 3e^3)$, and if $n \neq 0$,
all of a, ..., e may be chosen > 0 in an infinity of ways.
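The representation (5.17) can be produced explicitly from (5.16) by choosing the free parameter u. The sketch below assumes the linear forms obtained from the substitution w = −2x − 7u, v = x + 3u; the function name `represent_cube` is hypothetical.

```python
def represent_cube(n, u):
    """Return (a, b, c, d, e) with n^3 = a^3 + 35(b^3 - c^3 - 3 d^3 - 3 e^3)."""
    return 11*n + 35*u, 5*n + 18*u, n + 4*u, 3*n + 10*u, 3*n + 11*u

def check(n, u):
    """Verify the representation exactly for one pair (n, u)."""
    a, b, c, d, e = represent_cube(n, u)
    return n**3 == a**3 + 35 * (b**3 - c**3 - 3 * d**3 - 3 * e**3)

print(represent_cube(2, 0), check(2, 0))    # (22, 10, 2, 6, 6) True
print(represent_cube(-1, 3), check(-1, 3))  # all five entries positive for n = -1
```

Every sufficiently large u makes all five entries positive, which is one way to read the phrase "in an infinity of ways".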


Integration of (5.16) gives


(5.18) $11x^4 = (11x + 35u)^4 + 7[11(5x + 18u)^4 + 224u^4] - 385[(x + 4u)^4 + (3x + 10u)^4 + (3x + 11u)^4];$


and hence, on replacing u by 11u,


(5.19) $x^4 = 1331(x + 35u)^4 + 7[(5x + 198u)^4 + 298144u^4] - 35[(x + 44u)^4 + (3x + 110u)^4 + (3x + 121u)^4];$


(5.20) Every $n^4$ is of the form
$1331a^4 + 7(b^4 + 298144c^4) - 35(d^4 + e^4 + f^4),$


and all of a, ..., f may be chosen > 0 in an infinity of ways if $n \neq 0$.
Differentiation of (5.16) with respect to x gives


(5.21) $x^2 = 11(11x + 35u)^2 + 35[5(5x + 18u)^2 - (x + 4u)^2 - 9(3x + 10u)^2 - 9(3x + 11u)^2];$


(5.22) Every $n^2$ is of the form $11a^2 + 35(5b^2 - c^2 - 9d^2 - 9e^2)$, and if
$n \neq 0$, all of a, ..., e may be chosen > 0 in an infinity of ways.


Differentiating (5.16) with respect to u, and replacing x by 16w − 35v,
u by −5w + 11v in the result, gives


(5.23) $w^2 = 4(4w - 9v)^2 + 3[10(2w - 5v)^2 + 11(7w - 16v)^2 - 6(10w - 23v)^2];$


(5.24) Every $n^2$ is of the form $4a^2 + 3(10b^2 + 11c^2 - 6d^2)$, and if $n \neq 0$,
all of a, ..., d may be chosen > 0 in an infinity of ways.


Differentiation of the last of (5.14) gives


(5.25) $80(10w + 1)^2 - 17(17w - 1)^2 - 63(7w + 1)^2 = 1296w;$
(5.26) Every 1296n is of the form $80a^2 - 17b^2 - 63c^2$, with a, b, c > 0.
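Statement (5.26) follows from (5.25) by taking absolute values of the three linear forms, none of which vanishes at an integer. This is a small sketch of that reading; the function name `represent_1296n` is hypothetical.

```python
def represent_1296n(n):
    """Positive (a, b, c) with 1296n = 80 a^2 - 17 b^2 - 63 c^2, from (5.25)."""
    return abs(10*n + 1), abs(17*n - 1), abs(7*n + 1)

for n in (-2, 0, 1, 5):
    a, b, c = represent_1296n(n)
    print(n, (a, b, c), 80*a*a - 17*b*b - 63*c*c == 1296*n)
```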


A general result of the last type follows from Lemma 6, with the notation
as there:


(5.27) Every $-pabcv^2$ is of the form
$a(aw + \alpha v)^2 + b(bw + \beta v)^2 + pc(cw + \gamma v)^2.$
An example of Lemma 2 with $c \neq -1$ is
(5.28) $17(7w + 107v)^3 + (w - 31v)^3 - (18w + 275v)^3 = 17v^2(378w - 55v).$
The substitution w = 8x + 55y, v = −55x + 378y transforms (5.28)
into


$(14981x - 104940y)^3 + (1713x - 11663y)^3 - 17(5829x - 40831y)^3$
= — 378y)?; 


hence every $17(378n - 55)^2$ is of the form $a^3 + b^3 - 17c^3$, and if n > 7,
all of a, b, c may be chosen > 0.


6. Second method. This can be applied to any number of terms, here
illustrated for 3. Let a, ..., γ be such that


(6.1) $a + b + c = 0, \quad \alpha + \beta + \gamma = 0, \quad abc \neq 0.$
Then, identically in x, y,


(6.2) $(ax + \alpha y) + (bx + \beta y) + (cx + \gamma y) = 0.$


Two integrations of (6.2) with respect to x between the limits 0, x give




(6.3) $b^2c^2(ax + \alpha y)^3 + c^2a^2(bx + \beta y)^3 + a^2b^2(cx + \gamma y)^3 = y^2[3abc(bc\alpha^2 + ca\beta^2 + ab\gamma^2)x + (b^2c^2\alpha^3 + c^2a^2\beta^3 + a^2b^2\gamma^3)y]$


and it is clear that the restriction $abc \neq 0$ in (6.1) may be suppressed.
A simple reduction by (6.1) gives


$bc\alpha^2 + ca\beta^2 + ab\gamma^2 = -(b\alpha - a\beta)^2,$
$b^2c^2\alpha^3 + c^2a^2\beta^3 + a^2b^2\gamma^3 = (b\alpha - a\beta)^2[a(a + 2b)\beta + (2a + b)b\alpha],$


and we have


LEMMA 9. Identically in x, y,


$b^2c^2(ax + \alpha y)^3 + c^2a^2(bx + \beta y)^3 + a^2b^2(cx + \gamma y)^3 = -(b\alpha - a\beta)^2y^2[3abcx - \{b(2a + b)\alpha + a(a + 2b)\beta\}y],$


where a, ..., γ are such that

$a + b + c = 0, \quad \alpha + \beta + \gamma = 0.$


If (a, b) = d, and $(a, b, c) = d(a_1, b_1, c_1)$, the identity resulting from the
last has (a, b, c, x) replaced by $(a_1, b_1, c_1, dx)$, or, dropping suffixes, we recover
the preceding identity with (a, b, c) = 1 and x replaced by dx. Hence there
is no loss in generality in assuming (a, b) = 1 in Lemma 9. We can therefore
choose f, g as in (4.2), and get as the solution of $b\alpha - a\beta = u$,


(6.4) $\alpha = fu + av, \quad \beta = gu + bv.$
Hence, from Lemma 9, follows


LEMMA 10. If, without loss of generality, (a, b) = 1, and f, g are such
that bf − ag = 1, then, identically in w, z,


$(a + b)^2[b^2(aw + fz)^3 + a^2(bw + gz)^3] - a^2b^2[(a + b)w + (f + g)z]^3 = z^2[3ab(a + b)w + \{(a + 2b)ag + (2a + b)bf\}z].$


In obtaining this, the following change of notation was made, x + vy = w,
uy = z. We give a few of the simplest examples. For


(a, b, f, g) = (3, −2, 1, −1)
we get
(6.4) $9(2w + z)^3 + 36w^3 - 4(3w + z)^3 = z^2(18w + 5z);$
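Lemma 10 needs integers f, g with bf − ag = 1, which the extended Euclidean algorithm supplies whenever (a, b) = 1. The sketch below is one way to set this up, not the paper's own procedure; `ext_gcd`, `find_fg` and `lemma10_holds` are hypothetical names, and the f, g returned need not coincide with the values used in the worked examples, since they are not unique.

```python
def ext_gcd(m, n):
    """Return (g, s, t) with s*m + t*n == g and |g| = gcd(m, n)."""
    old_r, r, old_s, s, old_t, t = m, n, 1, 0, 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def find_fg(a, b):
    """Return integers (f, g) with b*f - a*g == 1, assuming gcd(a, b) = 1."""
    g0, s, t = ext_gcd(a, b)
    if g0 == -1:
        s, t = -s, -t
    elif g0 != 1:
        raise ValueError("a and b must be coprime")
    return t, -s          # b*t - a*(-s) = s*a + t*b = 1

def lemma10_holds(a, b, f, g, span=range(-4, 5)):
    """Check the identity of Lemma 10 on an integer grid of (w, z)."""
    return all(
        (a + b)**2 * (b*b*(a*w + f*z)**3 + a*a*(b*w + g*z)**3)
        - a*a*b*b*((a + b)*w + (f + g)*z)**3
        == z*z*(3*a*b*(a + b)*w + ((a + 2*b)*a*g + (2*a + b)*b*f)*z)
        for w in span for z in span
    )

f, g = find_fg(5, 7)
print((f, g), 7*f - 5*g, lemma10_holds(5, 7, f, g))
print(lemma10_holds(3, -2, 1, -1))
```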


(6.5) Every 18n + 5 is of the form $9a^3 + 36b^3 - 4c^3$, with all of a, b, c > 0
if n > 0.


From (a, b, f, g) = (4, −3, 1, −1),
(6.6) $16(3w + 1)^3 + 144w^3 - 9(4w + 1)^3 = 36w + 7;$


(6.7) Every 36n + 7 is of the form $16a^3 + 144b^3 - 9c^3$, with all of
a, b, c > 0 if n > 0.
(6.8) If (a, b) = 1, bf − ag = 1, every


$3ab(a + b)n + \{(a + 2b)ag + (2a + b)bf\}$
is of the form


$a^2(a + b)^2A^3 + b^2(a + b)^2B^3 - a^2b^2C^3,$
and with at most 3 exceptions n, all of A, B, C may be chosen > 0.
Differentiating the identity in Lemma 10 with respect to w we get
(6.9) Every $n^2$ is of the form
$(a + b)(aA^2 + bB^2) - abC^2,$


where (a, b) = 1, and all the integers A, B, C may be chosen $\neq 0$ in an
infinity of ways.


We have
$A = bm + gn, \quad B = am + fn, \quad C = (a + b)m + (f + g)n,$
where m is an arbitrary integer, and f, g are as in Lemma 10.


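7. Simultaneous solutions of (2.1). Let (a, b, c), (α, β, γ) be two distinct
solutions of (2.1). Then, identically in x, y,


$(ax + \alpha y)^3 + (bx + \beta y)^3 + p(cx + \gamma y)^3 = 3xy(Px + Qy),$
$P = a^2\alpha + b^2\beta + pc^2\gamma, \quad Q = a\alpha^2 + b\beta^2 + pc\gamma^2.$
Write

(7.1) $A = a^3 + 2b^3, \quad B = 2a^3 + b^3, \quad C = a^3 - b^3, \quad D = a^6 + a^3b^3 + b^6,$

so that (α, β, γ) = (aA, −bB, cC) is the first iterate of (a, b, c) obtained by
(1.2). A short reduction gives


(7.2) $P = 0, \quad Q = -9pa^3b^3c^3;$


(7.3) $a^3(x + Ay)^3 + b^3(x - By)^3 + pc^3(x + Cy)^3 = -27pa^3b^3c^3xy^2.$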




(7.4) If (a, b, c) is any solution of (2.1), every $-27pa^3b^3c^3n$, and every
$-27pa^3b^3c^3n^2$, is of the form $a^3e^3 + b^3f^3 + pc^3g^3$, and with at most 3 exceptions
n, in each case, $efg \neq 0$.


In (7.3) we take x = y and find
(7.5) A two-fold infinity of solutions of
$x^3 + y^3 = (u^3 + v^3)(z^3 + w^3)$


in integers x, y, z, w, u, v is
$x = a(1 + a^3 + 2b^3), \quad u = a, \quad z = 1 + a^3 - b^3,$
$y = b(1 - 2a^3 - b^3), \quad v = b, \quad w = 3ab,$
where a, b are arbitrary integers.
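The two-parameter family in (7.5) is simple to tabulate. The sketch below assumes the equation being solved is $x^3 + y^3 = (u^3 + v^3)(z^3 + w^3)$, which is what (7.3) gives at x = y after eliminating $pc^3$ through (2.1); the function name `two_parameter_solution` is hypothetical.

```python
def two_parameter_solution(a, b):
    """Return (x, y, z, w, u, v) solving x^3 + y^3 = (u^3 + v^3)(z^3 + w^3)."""
    x = a * (1 + a**3 + 2 * b**3)
    y = b * (1 - 2 * a**3 - b**3)
    z = 1 + a**3 - b**3
    w = 3 * a * b
    return x, y, z, w, a, b

def is_valid(sol):
    """Exact integer check of the equation for one sextuple."""
    x, y, z, w, u, v = sol
    return x**3 + y**3 == (u**3 + v**3) * (z**3 + w**3)

sol = two_parameter_solution(2, 1)
print(sol, is_valid(sol))   # (22, -16, 8, 6, 2, 1) True
```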
Integrating (7.3) with respect to x from 0 to x, and reducing the constant
of integration, we find (see (7.1)),
(7.6) $a^3A^4 + b^3B^4 + pc^3C^4 = -27pa^3b^3c^3D;$
(7.7) $a^3(x + Ay)^4 + b^3(x - By)^4 + pc^3(x + Cy)^4 + 27pa^3b^3c^3Dy^4 = -54pa^3b^3c^3x^2y^2;$
(7.8) If (a, b, c) is any solution of (2.1), every $54pa^3b^3c^3n^2$ is of the form
$-27pa^3b^3c^3D - a^3\alpha^4 - b^3\beta^4 - pc^3\gamma^4$, with α, β, γ > 0
if n is different from $-a^3 - 2b^3$, $2a^3 + b^3$, $b^3 - a^3$.


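Integrating (7.7) with respect to y from 0 to y, and reducing as before,
we find
(7.9) $a^3BC - b^3CA + pc^3AB = 9pa^3b^3c^3;$


(7.10) $a^3BC(x + Ay)^5 - b^3CA(x - By)^5 + pc^3AB(x + Cy)^5 + 27pa^3b^3c^3ABCDy^5 - 9pa^3b^3c^3x^5 = -90pa^3b^3c^3ABCx^2y^3;$


(7.11) Every $-90pa^3b^3c^3ABCn^2$, and every $90pa^3b^3c^3ABCn^3$, is of the form
$a^3BC\alpha^5 - b^3CA\beta^5 + pc^3AB\gamma^5 + 9pa^3b^3c^3(3ABCD\delta^5 - \epsilon^5)$, the notation being
as in (2.1), (7.1), with $\alpha\beta\gamma\delta\epsilon \neq 0$ if $n \neq -A, B, -C$ for the first, and
$\alpha\beta\gamma\delta\epsilon \neq 0$ for the second.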



Integration of (7.10) with respect to x between 0 and x, and reduction
of the constant of integration, gives


(7.12) $a^3A^5 - b^3B^5 + pc^3C^5 = 9pa^3b^3c^3ABC;$


(7.13) Ay)* + — By)* — pee? AB(a + Cy)® 
+ 3pa*b*c* — ABCy’) (x* — 3ABCy’) 
= 162pa*b*c® ABC Dzy’ ; 


(7.14) Hvery 162pa*b?c? ABCDn is of the form 
3pa*b*c* (38° — ABCe®) (8° — 3ABCe*) — a®BCa® + — 


the notation being as im (2.1), (7.1), and with at most 3 exceptions n, all of 
+,e may be chosen > 0. 


The processes of this section can obviously be continued indefinitely. From 
(7.12) we note that 


(7.15) For an infinity of integers p, 
vu’ + — = 
is solvable in integers with $xyzuvw \neq 0$.


CALIFORNIA INSTITUTE OF TECHNOLOGY, 
PASADENA, CALIFORNIA. 




ON THE NORMAL FORMS OF LINEAR CANONICAL 
TRANSFORMATIONS IN DYNAMICS. 


By JoHN WILLIAMSON. 


Let $n$ be the number of degrees of freedom of a linear conservative dynamical system and let the point $(q_1, q_2, \dots, q_n, p_1, p_2, \dots, p_n)$ of the phase space be denoted by $x$. A system of $2n$ ordinary differential equations of the first order, which are homogeneous, linear and do not contain $t$ explicitly, is a canonical system if, and only if, the differential equations can be written in the form

$G \dfrac{dx}{dt} = Hx$,

where $H$ is a real symmetric matrix of order $2n$ and $G$ is the skew symmetric matrix $G = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$, and $E$ the unit matrix of order $n$.¹ A non-singular linear transformation

$y = Ax$

is said to be a canonical transformation, if it transforms every linear canonical system into a linear canonical system. It is known that the transformation of matrix $A$ is canonical if, and only if,

(i) $A'GA = sG$

where $s$ is a constant. It can be assumed without loss of generality that² $s = 1$ and accordingly we shall call a matrix $A$ a canonical matrix if it satisfies (i) with $s = +1$.

In a previous paper³ normal forms for dynamical systems under canonical transformations were found and here we determine normal forms for canonical matrices under canonical transformations. These normal forms are not completely determined by the elementary divisors of the canonical matrix, so that two canonical matrices, which are similar, are not necessarily similar under a canonical transformation.

¹ A. Wintner, “On the linear conservative dynamical systems,” Annali di matematica pura ed applicata, ser. 4, tomo 13 (1934-35), pp. 105-112.

² E. R. van Kampen and A. Wintner, “On the canonical transformations of Hamiltonian systems,” American Journal of Mathematics, vol. 58 (1936), pp. 851-863.

³ John Williamson, “On the algebraic problem concerning the normal forms of linear dynamical systems,” American Journal of Mathematics, vol. 58 (1936), pp. 141-163.


When considering the solutions of the equations of variation belonging to a periodic solution of conservative non-linear dynamical systems, the question of the occurrence of secular terms is known to depend on the elementary divisors of a canonical matrix.⁴ In fact the degree of the highest secular term occurring is determined by the greatest exponent of the elementary divisors. For this reason, it is of interest that it is possible for a canonical matrix to have an elementary divisor of order $2m$ (§ 6, Result III$_a$).⁵

In the following sections the problem is considered from a purely algebraic 
point of view and in section 1 is reduced to a simpler one of a similar nature; 
sections 2 and 3 are devoted to the proofs of preliminary lemmas, while the 
main results are obtained in the remaining sections. 


1. Simplification of the problem. Let $E$ be the unit matrix of order $n$ and $G$ the skew-symmetric matrix $G = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$ of order $2n$. A real matrix $A_i$ is said to be a canonical matrix, if

(1) $A_i G A_i' = G$,


where $A_i'$ is the transposed of $A_i$. We shall be interested in determining necessary and sufficient conditions that two canonical matrices $A_1$ and $A_2$ be similar under a canonical transformation; in other words that there exist a third canonical matrix $A_3$, such that

(2) $A_1 = A_3^{-1} A_2 A_3$.


We first reduce this problem to a somewhat simpler one. 
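The defining relation (1) is easy to test by machine. The following is a minimal sketch in plain Python; the sign convention for $G$ (unit matrix in the upper right block) and the three sample matrices are assumptions made for illustration only.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def make_G(n):
    """G = [[0, E], [-E, 0]] of order 2n (assumed sign convention)."""
    G = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        G[i][n + i] = 1
        G[n + i][i] = -1
    return G

def is_canonical(A):
    """True when A G A' = G, with G of the same order as A."""
    G = make_G(len(A) // 2)
    return matmul(matmul(A, G), transpose(A)) == G

# (E B; 0 E) with B symmetric.
shear = [[1, 0, 2, 1], [0, 1, 1, 3], [0, 0, 1, 0], [0, 0, 0, 1]]
# [F, (F')^{-1}] with F = (1 1; 0 1).
paired = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -1, 1]]
# A stretch of one coordinate only, which does not preserve G.
stretch = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print(is_canonical(shear), is_canonical(paired),
      is_canonical(matmul(shear, paired)), is_canonical(stretch))
# prints: True True True False
```

The third value illustrates that a product of canonical matrices is again canonical, which is what makes similarity under a canonical transformation an equivalence relation.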

If $A_1$ and $A_2$ are two canonical matrices, which are similar, and a matrix $Q$, to be specified later, is similar to $A_1$, then $Q$ is similar to $A_2$. There exist, therefore, two non-singular matrices $R_1$ and $R_2$, such that

(3) $R_i A_i R_i^{-1} = Q$ $(i = 1, 2)$.

The matrices

(4) $R_i G R_i' = S_i$ $(i = 1, 2)$,

are skew symmetric and are left invariant by $Q$, that is, satisfy the equations,


⁴ A. Wintner, “Three notes on characteristic exponents and equations of variation in celestial mechanics,” American Journal of Mathematics, vol. 53 (1931), pp. 605-625.

⁵ This result could be deduced by suitable modifications from papers by Alfred Loewy, “Allgemeine bilineare Formen mit konjugirt imaginären Variabeln,” Abhandlungen der Kaiserlichen Leopoldinisch-Carolinischen Deutschen Akademie der Naturforscher, Band 71, S. 378-446, Halle (1898), and T. J. I’A. Bromwich, “Canonical reduction of bilinear forms,” Proceedings of the London Mathematical Society, vol. 32 (1900), pp. 321-332.


(5) $Q S_i Q' = S_i$ $(i = 1, 2)$.


Thus, if $Q$ is any matrix similar to the two canonical matrices $A_1$ and $A_2$, there is associated with $A_1$ a skew symmetric matrix $S_1$ and with $A_2$ a skew symmetric matrix $S_2$, both of which are left invariant by $Q$.

We now prove

THEOREM 1. A necessary and sufficient condition, that $A_1$ be similar to $A_2$ under a canonical transformation, is that there exist a non-singular matrix $H$, such that

(6) $HQ = QH$,

and that the two skew symmetric matrices $S_1$ and $S_2$, associated with $A_1$ and $A_2$, satisfy

(7) $HS_1H' = S_2$.

Proof. Let a matrix $H$ satisfying (6) and (7) exist. Then

$A_1 = R_1^{-1} Q R_1 = R_1^{-1} H^{-1} Q H R_1$ by (3) and (6),
$= R_1^{-1} H^{-1} R_2 A_2 R_2^{-1} H R_1$ by (3),
$= A_3^{-1} A_2 A_3$,

where $A_3 = R_2^{-1} H R_1$. Further

$A_3 G A_3' = R_2^{-1} H R_1 G R_1' H' (R_2')^{-1} = R_2^{-1} H S_1 H' (R_2')^{-1}$ by (4),
$= R_2^{-1} S_2 (R_2')^{-1} = G$ by (7) and (4).

Hence $A_3$ is a canonical matrix. Conversely, if (2) is satisfied and $A_3$ is a canonical matrix, the matrix $H = R_2 A_3 R_1^{-1}$ satisfies (6) and (7); for

$HQH^{-1} = R_2 A_3 R_1^{-1} Q R_1 A_3^{-1} R_2^{-1} = R_2 A_3 A_1 A_3^{-1} R_2^{-1} = R_2 A_2 R_2^{-1} = Q$ and
$HS_1H' = R_2 A_3 R_1^{-1} S_1 (R_1')^{-1} A_3' R_2' = R_2 A_3 G A_3' R_2' = R_2 G R_2' = S_2$.


Since, in the above, $Q$ is any matrix similar to $A_1$, we are at liberty to choose $Q$ in a suitable normal form. Then, if $S$ is any real skew symmetric matrix satisfying the equation

(8) $QSQ' = S$,

we shall determine a normal form for $S$ under transformations by matrices permutable with $Q$. If $HQ = QH$ and $HSH' = S_1$, we shall call the transformation by the matrix $H$ an admissible transformation and shall write $S \cong S_1$.




2. Preliminary lemmas. When $R$ is a square matrix of order $m$, we may consider $R$ as a matrix of matrices and write

(9) $R = (R_{ij})$ $(i, j = 1, 2, \dots, t)$,

where $R_{ij}$ is a matrix of $r_i$ rows and $r_j$ columns and $r_1 + r_2 + \cdots + r_t = m$. If $S$ is a second $m$-rowed square matrix and $S$ is written as a matrix of matrices

(10) $S = (S_{ij})$ $(i, j = 1, 2, \dots, t)$,

where $S_{ij}$ is also a matrix of $r_i$ rows and $r_j$ columns, we shall say that $R$ and $S$ are similarly partitioned or that (10) is a partition of $S$ similar to that of $R$ in (9). If in (9), when $i$ is different from $j$, $R_{ij}$ is the zero matrix, we shall call $R$ a diagonal block matrix and write

$R = [R_{11}, R_{22}, \dots, R_{tt}]$.

Lemma 1. If the matrices $S_1$, $S_2$ and $Q$ satisfy (5), then $S_2 = MS_1$, where $MQ = QM$.

Proof. Since $S_1$ and $S_2$ are non-singular, $Q$ is non-singular and accordingly

$(Q')^{-1} = S_1^{-1} Q S_1 = S_2^{-1} Q S_2$,

so that

$S_2 S_1^{-1} Q = Q S_2 S_1^{-1}$.

If $M = S_2 S_1^{-1}$, then $MQ = QM$ and $S_2 = MS_1$.


Lemma 2. If $Q = [Q_1, Q_2]$ and no latent root of $Q_1$ is the reciprocal of a latent root of $Q_2$, a matrix $S$, which satisfies (8), is of the form $[S_{11}, S_{22}]$ and

$Q_i S_{ii} Q_i' = S_{ii}$ $(i = 1, 2)$.

Proof. Let

$S = (S_{ij})$ $(i, j = 1, 2)$,

be a partition of $S$ similar to that of $Q$. Then

(11) $Q_i S_{ij} Q_j' = S_{ij}$ $(i, j = 1, 2)$.

Since, by hypothesis, no latent root of $Q_1$ is the reciprocal of a latent root of $Q_2$, no latent root of $Q_1$ is the same as a latent root of $(Q_2')^{-1}$. Therefore, as a consequence of (11), $S_{12} = 0$. Similarly $S_{21} = 0$ and the lemma is proved.


Lemma 3. Let $Q = [Q_1, Q_2]$ and let $S$ be a skew-symmetric matrix satisfying (8). If $S = (S_{ij})$ $(i, j = 1, 2)$, is a partition of $S$ similar to that of $Q$ and, if $S_{11}$ is non-singular, then

$S \cong S_1$,

where $S_1 = [S_{11}, T_{22}]$.

Proof. Let $E_i$ be the unit matrix of the same order as $Q_i$ and $H$ be the matrix

$H = \begin{pmatrix} E_1 & 0 \\ -S_{21}S_{11}^{-1} & E_2 \end{pmatrix}$.

As a consequence of (11),

$S_{21}S_{11}^{-1}Q_1 = Q_2S_{21}S_{11}^{-1}$.

Hence $HQ = QH$. Since $S$ is skew-symmetric,

$HSH' = [S_{11}, T_{22}]$,

where $T_{22} = S_{22} - S_{21}S_{11}^{-1}S_{12}$.
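The block elimination used in the proof of Lemma 3 can be followed on a small case. The sketch below is only an illustration: the matrices $Q = [P, P]$ with $P = E + U$ of order 2, and the skew-symmetric $S$ left invariant by $Q$, are hypothetical choices, and the transforming matrix is taken as $H = \begin{pmatrix} E & 0 \\ -S_{21}S_{11}^{-1} & E \end{pmatrix}$, which is an assumed form of the matrix of the proof.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(M):
    """Exact inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def block(M, i, j, size=2):
    """Block (i, j) of M in a partition into size x size blocks."""
    return [row[j * size:(j + 1) * size] for row in M[i * size:(i + 1) * size]]

# Hypothetical data: Q = [P, P], P = (1 1; 0 1), and a skew S with Q S Q' = S.
Q = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
S = [[0, 1, 1, 2], [-1, 0, -2, 0], [-1, 2, 0, 3], [-2, 0, -3, 0]]

S11, S12 = block(S, 0, 0), block(S, 0, 1)
S21, S22 = block(S, 1, 0), block(S, 1, 1)

K = matmul(S21, inverse2(S11))            # S21 * S11^{-1}
H = [[1, 0, 0, 0], [0, 1, 0, 0],
     [-K[0][0], -K[0][1], 1, 0],
     [-K[1][0], -K[1][1], 0, 1]]

KS12 = matmul(K, S12)
T22 = [[S22[i][j] - KS12[i][j] for j in range(2)] for i in range(2)]
reduced = matmul(matmul(H, S), transpose(H))   # H S H'

print(matmul(H, Q) == matmul(Q, H), reduced, T22)
```

Here `reduced` comes out as the diagonal block matrix $[S_{11}, T_{22}]$, with the off-diagonal blocks cleared, and `H` commutes with `Q`, so the transformation is admissible.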


3. Normal form of Q. Let $p$ be a real number or else the two rowed real matrix

$p = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$, where $b \neq 0$.

Let $E_i$ denote the unit matrix of order $e_i$ and $U_i$ the auxiliary unit matrix⁶ of the same order. The matrix,

(12) $P_i = pE_i + pU_i$,

has the single elementary divisor $(\lambda - p)^{e_i}$ or the two elementary divisors $(\lambda - a + ib)^{e_i}$, $(\lambda - a - ib)^{e_i}$, according as $p$ is a matrix of order one or two.⁷ The diagonal block matrix,

(13) $\pi = [P_1, P_2, \dots, P_t]$,

has therefore the elementary divisors $(\lambda - p)^{e_j}$ or $(\lambda - a + ib)^{e_j}$, $(\lambda - a - ib)^{e_j}$ $(j = 1, 2, \dots, t)$. We may take $Q$ in the normal form,

(14) $Q = [\pi_1, \pi_2, \dots, \pi_k]$,

where the matrix $\pi_j$ is obtained from $\pi$ in (13) by writing $p_j$ for $p$, $e_{ji}$ for $e_i$ and $t_j$ for $t$. Further $p_j \neq p_i$, if $j$ is different from $i$.
If $H$ is a matrix commutative with $Q$, $H$ is a diagonal block matrix $[H_1, H_2, \dots, H_k]$, where

(15) $H_j \pi_j = \pi_j H_j$.


⁶ Cf. Turnbull and Aitken, Canonical Matrices, p. 62.

⁷ By the elementary divisors of a matrix $A$ we mean the elementary divisors of $A - \lambda E$.

⁸ John Williamson, “The idempotent and nilpotent elements of a matrix,” American Journal of Mathematics, vol. 58 (1936), p. 477.




But the form of a matrix $H_j$ satisfying (15) is known.⁸ In fact, if $W\pi = \pi W$ and $W = (W_{ij})$ $(i, j = 1, 2, \dots, t)$, is a partition of $W$ similar to that of $\pi$ in (13) and, if $e_i \geq e_j$, then

$W_{ij} = \begin{pmatrix} F_{ij} \\ 0 \end{pmatrix}$ and $W_{ji} = (0, F_{ji})$,

where $F_{ij}$ and $F_{ji}$ are square matrices of order $e_j$. Moreover $F_{ij}$ and $F_{ji}$ are both polynomials in $U_j$ with coefficients, which are polynomials in $p$. More exactly


while 


ej-1 


1 
If 1 denotes the matrix ( and p is a two rowed matrix, 


0 
—l1 0 
p=a+ib and p=a—ib=p. 
With this notation (16) becomes 
(17) fused". 


a-0 


Let $T_i$ be the counter unit matrix of order $e_i$. Then

(18) $T_i U_i = U_i' T_i$
and as a consequence of (17) 
(19) T = F,;T;. 


Lemma 4. Let TW’; = Wal: If = Cj, Wi; = Wij. If Ci > 
the element in the first row and first column of Wi; is zero. 

Proof. Let e Then 

Tj = (Tj = (FiiT,9) by (19), 
== (0, 

Hence W;; = (0, ;;) and the lemma is proved. 

4. Reduction of S. Let

(20) $Q = [Q_1, Q_2, \dots, Q_k]$,

where no latent root of $Q_1$ has absolute value 1, each latent root of $Q_2$ is equal to 1, each latent root of $Q_3$ is equal to $-1$ and each latent root of $Q_j$, $j > 3$, is equal to $a_j \pm ib_j$, where $a_j^2 + b_j^2 = 1$ and $a_j + ib_j \neq a_r \pm ib_r$ unless $r = j$. Then, if $S$ is a skew-symmetric matrix satisfying (8), as a consequence of Lemma 2, $S = [S_1, S_2, \dots, S_k]$, where

(21) $Q_j S_j Q_j' = S_j$ $(j = 1, 2, \dots, k)$.


Since any matrix $H$, commutative with the matrix $Q$ in (20), is also a diagonal block matrix, we may consider each of the equations (21) separately. Since $S_1$ is non-singular, $Q_1$ is similar to $(Q_1')^{-1}$ and, since no latent root of $Q_1$ has absolute value 1, $Q_1$ is similar to a matrix $[F_1, (F_1')^{-1}]$, where the order of $F_1$ is one-half that of $Q_1$. Hence $Q_1$ may be replaced by the matrix $[F_1, (F_1')^{-1}]$. It is now a consequence of (21) that

$S_1 = \begin{pmatrix} 0 & \sigma \\ -\sigma' & 0 \end{pmatrix}$, where $\sigma F_1 = F_1 \sigma$.

If $H_1 = \begin{pmatrix} E_1 & 0 \\ 0 & (\sigma')^{-1} \end{pmatrix}$, where $E_1$ is the unit matrix of the same order as $F_1$,

$H_1[F_1, (F_1')^{-1}] = [F_1, (F_1')^{-1}]H_1$ and $H_1 S_1 H_1' = \begin{pmatrix} 0 & E_1 \\ -E_1 & 0 \end{pmatrix}$.

Hence we have

Result I. The matrix $Q_1$ may be taken in the form $Z_1 = [F_1, (F_1')^{-1}]$. With this value of $Q_1$, $S_1 \cong G_1$.
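Result I can be traced on a concrete case. The sketch below is a hedged illustration: $F_1$ and $\sigma$ are hypothetical choices (with $\sigma$ commutative with $F_1$), and the transforming matrix $H_1 = [E_1, (\sigma')^{-1}]$ is one admissible choice assumed here, not necessarily the one intended in the text.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(M):
    """Exact inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def block_diag2(A, B):
    """Diagonal block matrix [A, B] for two 2 x 2 blocks."""
    return [A[0] + [0, 0], A[1] + [0, 0], [0, 0] + B[0], [0, 0] + B[1]]

F = [[2, 1], [0, 2]]        # latent root 2, not of absolute value 1
sigma = [[1, 3], [0, 1]]    # a polynomial in F, hence commutative with F

Z = block_diag2(F, inverse2(transpose(F)))              # [F, (F')^{-1}]
neg_sigma_t = [[-x for x in row] for row in transpose(sigma)]
S1 = [[0, 0] + sigma[0], [0, 0] + sigma[1],
      neg_sigma_t[0] + [0, 0], neg_sigma_t[1] + [0, 0]]  # (0 sigma; -sigma' 0)

H1 = block_diag2([[1, 0], [0, 1]], inverse2(transpose(sigma)))
G = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]

invariant = matmul(matmul(Z, S1), transpose(Z)) == S1    # Z S1 Z' = S1
admissible = matmul(H1, Z) == matmul(Z, H1)              # H1 commutes with Z
normalised = matmul(matmul(H1, S1), transpose(H1)) == G  # H1 S1 H1' = G
print(invariant, admissible, normalised)  # prints: True True True
```

Exact rational arithmetic is used so that the three comparisons are equalities rather than floating-point approximations.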

The matrix $F_1$ is not unique and may be replaced by any matrix similar to it; in fact $F_1$ may be taken in the normal form $[\pi_1, \pi_2, \dots, \pi_l]$, where $\pi_j$ is defined by (13) and $|p_j| \neq 1$. As a consequence of the above and Theorem 1 we have

THEOREM 2. If $A_1$ is a canonical matrix similar to a second canonical matrix $A_2$ and, if no latent root of $A_1$ is of absolute value 1, $A_1$ is similar to $A_2$ under a canonical transformation.


We next consider equations (21), when $j \geq 2$, and for simplicity of notation temporarily drop the suffix $j$. The matrix $Q = Q_j$ is therefore of the form

(22) $Q = [P_1, P_2, \dots, P_t]$,

where $P_i$ is defined by (12) with the added restriction that $|p| = 1$. Hence $p$ is a real orthogonal matrix of order one or two. If

$S = (S_{ij})$ $(i, j = 1, 2, \dots, t)$,

is a partition of $S$ similar to that of $Q$ in (22), equation (21) implies

$P_i S_{ij} P_j' = S_{ij}$ $(i, j = 1, 2, \dots, t)$,

or, if $S_{ij} = \sigma$,

(23) $P_i \sigma P_j' = \sigma$.


The matrix $\sigma = (\sigma_{rs})$ in (23) is a matrix of $m = e_i$ rows and $n = e_j$ columns. On equating corresponding elements in (23) we obtain

(24) $p(\sigma_{rs} + \sigma_{r+1,s} + \sigma_{r,s+1} + \sigma_{r+1,s+1})p' = \sigma_{rs}$ $(r = 1, 2, \dots, m;\ s = 1, 2, \dots, n)$,

with the understanding that $\sigma_{m+1,s} = \sigma_{r,n+1} = 0$. If $p = \pm 1$, (24) reduces to

(25) $\sigma_{r+1,s} + \sigma_{r,s+1} + \sigma_{r+1,s+1} = 0$.

On substituting s =n, n—1, n—2,- - -, successively in (25) we have 
and, on substituting r= m,m— 1, m— 2,- - -, successively, 

(27) = Om-1,8+2 = =Om-r,rs041 = 0, (r= 1, m; s = 1, 2,°--,n). 


We easily deduce from (25), (26) and (27), 


Lemma 5. If $\sigma$ is a matrix satisfying (23) and, if $e_i \neq e_j$, the last row and the last column of $\sigma$ are zero. If $e_i = e_j = n$, then $\sigma_{rs} = 0$, when $r + s > n + 1$ and
(28) $\sigma_{r,n+1-r} = (-1)^{r-1}\sigma_{1n}$.

If $p$ is of order 2, equations (25) may be solved to give a particular matrix $S^*$, which satisfies (22) and whose elements are two rowed scalar matrices. Any other matrix $S$, which satisfies (22) is, by Lemma 1, of the form $MS^*$, where $MQ = QM$. Since the elements of $M$ are polynomials in $p$, so are the elements of $S$. Hence

(29) $p\sigma_{rs}p' = \sigma_{rs}$,

since $p$ is orthogonal. Accordingly, Lemma 5 is also true, when $p$ is a two-rowed matrix.

Let $e_1 = e_2 = \cdots = e_c > e_{c+1}$ and let $s_{ij}$ denote the element in the first column and the last row of $S_{ij}$. Then, by Lemma 5, $S_{11}$ is singular, if and only if $s_{11}$ is zero. If $S_{jj}$ is non-singular, $1 < j \leq c$, we may interchange $S_{jj}$ and $S_{11}$ without disturbing $Q$. If $S_{jj}$ is singular for all values of $j$, $1 \leq j \leq c$, then

(30) $s_{jj} = 0$ $(j = 1, 2, \dots, c)$.

Since $S$ is non-singular and since, by Lemma 5, the last row of $S_{1k}$ is zero, when $k > c$, for at least one value of $j$, $1 < j \leq c$, $s_{1j} \neq 0$. We may therefore suppose, without any loss of generality, that $s_{12} \neq 0$.

Let J be the unit matrix of order e, + e,-+-+ +--+ e; and H, the matrit 


ie . Then H, is commutative with Q and 
1 


$H_1 S H_1' = R = (R_{ij})$ $(i, j = 1, 2, \dots, t)$,

where

$R_{11} = S_{11} + S_{12} + S_{21} + S_{22}$.

The element in the last row and first column of $R_{11}$ is

$r_{11} = s_{11} + s_{12} + s_{21} + s_{22} = s_{12} + s_{21}$ by (30).


If p is a two-rowed matrix the transformation by the matrix 


is admissible and 


HSH’, = F = (Fi;), (1,7 = 1, 
where 
fir = — S12). 
Since $S$ is skew symmetric, $S_{21} = -S_{12}'$ and by (28)

$s_{21} = (-1)^{e_1} s_{12}'$.

Hence, if $e_1$ is even and $p$ is of order 1, $s_{21} = s_{12}' = s_{12}$ and $r_{11} = 2s_{12} \neq 0$. If $p$ is of order 2, at least one of $f_{11}$ or $r_{11}$ is different from zero and accordingly at least one of $F_{11}$ or $R_{11}$ is non-singular. Therefore, unless $e_1$ is odd and $p = \pm 1$,

$S \cong L = (L_{ij})$ $(i, j = 1, 2, \dots, t)$,

where $L_{11}$ is non-singular.

If $e_1$ is odd and $p = \pm 1$, $s_{11} = -s_{11} = 0$ and $S_{11}$ is singular. Hence $c \geq 2$ and we may suppose that $s_{12} \neq 0$. Then the matrix

$S_1 = (S_{ij})$ $(i, j = 1, 2)$,

is non-singular, since

$|S_1| = \pm (s_{12})^{2e_1}$,

as is seen by re-arranging the rows and columns of $S_1$ in the order $1, e_1 + 1, 2, e_1 + 2$, etc. By repeated applications of Lemma 3 we therefore deduce that

(31) $S \cong [S_1, S_2, \dots, S_k]$.


The component matrices $S_j$ on the right of (31) are of two distinct types:

Type a. The matrix $S_j$ is of order $2e_j$, $p = \pm 1$, $e_j$ is odd, and $[P_j, P_j] S_j [P_j, P_j]' = S_j$.

Type b. The matrix $S_j$ is of order $e_j$ and $P_j S_j P_j' = S_j$.

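Reduction of type a. For convenience we drop the suffix $j$ and write

$[P, P]S[P, P]' = S$, where $S = (S_{rs})$ $(r, s = 1, 2)$.

Hence

$PS_{rs}P' = S_{rs}$ $(r, s = 1, 2)$.

As a consequence of Lemma 1,

(32) $S_{rs} = M_{rs}X$,

where $M_{rs} = M_{rs}(U)$ is a polynomial in $U = U_j$ and $X$ is a particular solution of $PXP' = X$. Since $S_{rr}$ is singular and $X$ is non-singular,

(33) $M_{rr}(U) = U m_{rr}(U)$ $(r = 1, 2)$.

If

$\sigma_1 = \begin{pmatrix} 0 & S_{12} \\ S_{21} & 0 \end{pmatrix}$ and $\sigma_2 = \begin{pmatrix} S_{11} & 0 \\ 0 & S_{22} \end{pmatrix}$,

$\sigma_1$ is non-singular and

(34) $\sigma_2\sigma_1^{-1} = \begin{pmatrix} 0 & S_{11}S_{21}^{-1} \\ S_{22}S_{12}^{-1} & 0 \end{pmatrix}$.

The matrix $\sigma_2\sigma_1^{-1}$ is commutative with $[P, P]$ and therefore so is the matrix $H = E - \tfrac{1}{2}\sigma_2\sigma_1^{-1}$. Further

$HSH' = \tau_1 + \tau_2$,

where $\tau_1 = \sigma_1 - \tfrac{3}{4}\sigma_2\sigma_1^{-1}\sigma_2$ and $\tau_2 = \tfrac{1}{4}\sigma_2\sigma_1^{-1}\sigma_2\sigma_1^{-1}\sigma_2$. As a consequence of (32), (33), and (34)

$\tau_2 = [K_{11}X, K_{22}X]$,

where $K_{11}$ and $K_{22}$ are polynomials in $U$ each with a factor $U^3$, while $\tau_1$ is of the same nature as $\sigma_1$ and is non-singular. We may therefore repeat this process of reduction with $\sigma_1$ replaced by $\tau_1$ and, since $U^{e_j} = 0$, in at most $(e_j + 1)/2$ steps reduce $S$ to the form

$\begin{pmatrix} 0 & S_{12} \\ -S_{12}' & 0 \end{pmatrix}$.

Let $H$ be the matrix $H = [E_j, (S_{12}')^{-1}]$. Then

$HSH' = \begin{pmatrix} 0 & E_j \\ -E_j & 0 \end{pmatrix}$

and

$H[P, P]H^{-1} = [P, (S_{12}')^{-1}PS_{12}'] = [P, (P')^{-1}]$.

We have therefore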


Result II. In type a the matrix $[P_j, P_j]$ may be replaced by

$Z_j = [P_j, (P_j')^{-1}]$.

Then $S_j \cong G_j$.

Reduction of type b. Again we drop the suffix $j$ and write $P = pE + pU$, where $P$ is of order $e$. If $T$ is the counter unit matrix of order $e$, it is a consequence of Lemma 5 that

(37) $S = (\sigma_0 + \sigma_1 + \cdots + \sigma_{e-1})T$,

where the elements of $\sigma_k$ are all zero except in the $k$-th diagonal above the leading one. If $\sigma_{jk}$ is the non-zero element in the $j$-th row of $\sigma_k$, a simple calculation shows that

(38) $\sigma_{jk} = s_{j,e+1-j-k}$.
The matrix U"o; is of the same type as ox,; and, in particular, the elements of 
(39) Pk — U*o, 
are all zero except those in the k-th diagonal above the leading one. The non- 


zero element in the j-th row of px is 


(40) Pik = Oj+k,o = 
If 
Hy, = FE + qu*, where qp = pq, 
then 
(41) = C = (Cre), (r,s = 1,2,- e). 


Since TU’ = UT, 


C = (E+ qU*)STT (BE + 0") 
= (E+ qU*) (00 +01+° (E+ 7¢U")T 


where 
(42) Vi (f = 0, 1, -,k—1), 
and 
= on + U¥ goo + oog UF. 
Since, o> — — Uap, this last equation becomes 


yn = on + (¢ + (—1)*q’) 
The non-zero element in the j-th row of +, is (cf. 38) 


= jn + + (—1)"9') 




Hence by (38) and (40), 
(43) Cj,e41-j-k = 8j,e+1-j-k + ( + (—1)*q’) 


Type b,. e=2m. Let +1—2j and g = — 85; Then, 
since sj; is skew symmetric and S¢,;-;,; is symmetric, gq is skew symmetric and 
as a consequence of (43), cj; = 0. 


Since, by (25), 853-1 + 8j-1,5 + =0, if 85; = 0, 85,541 8j4,; and 
accordingly s;,;-. is symmetric. Therefore, when s;; —0, if k=e-+2—2 
and ¢ = — it is a consequence of (43) that cj,;.—0. 


Hence it is possible by an admissible transformation to reduce $S$ to a form, in which $s_{jj} = s_{j,j-1} = s_{j-1,j} = 0$. Equations (42) show that such a transformation does not alter the value of $s_{rs}$, when $r + s > 2j$. Therefore by giving $j$ successively the values $m, m-1, \dots, 2, 1$ we deduce that $S \cong D$, where

(44) $d_{11} = d_{21} = d_{22} = \cdots = d_{m,m-1} = d_{mm} = 0$.

Equations (44) and (25) together imply that

(45) $d_{rs} = 0$ $(r, s = 1, 2, \dots, m)$.


The non-zero elements of D are now determined by means of equations (25) 
in the form 


where Zz is unique. Hence 


S=dX 


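where $X$ is uniquely determined. Since $d = d_{1e}$ is symmetric, $d$ is a scalar. Therefore the admissible transformation of matrix $E/\sqrt{|d|}$ reduces $dX$ to the form $\epsilon X$, where $\epsilon = \pm 1$, so that

$S \cong \epsilon X$.

As a consequence of (45), we have

(46) $X = \begin{pmatrix} 0 & X_{12} \\ -X_{12}' & 0 \end{pmatrix}$,

where $X_{12}$ is a square matrix of order $m = e/2$. For example, if $m = 4$,

$X_{12} = \begin{pmatrix} 1 & 3 & 3 & 1 \\ -1 & -2 & -1 & 0 \\ 1 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}$,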


where, in case $p$ is a two-rowed matrix, each integer denotes the corresponding scalar matrix.⁹ We have therefore

Result III. In type $b_1$, when $e_j = 2m_j$, $S_j \cong \epsilon X_j$, where $\epsilon = \pm 1$ and $X_j$ is uniquely determined.
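The example with $m = 4$ can be checked directly in the scalar case $p = \pm 1$. The sketch below assumes that the four rows printed in the example form the upper right block $X_{12}$ and that the lower left block is $-X_{12}'$, so that $X$ is skew-symmetric of order 8; it then tests the invariance $PXP' = X$ with $P = pE + pU$ and the pattern described in Lemma 5.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

# Upper right block of the example (assumed reading of the printed array).
X12 = [[1, 3, 3, 1], [-1, -2, -1, 0], [1, 1, 0, 0], [-1, 0, 0, 0]]
m = len(X12)
e = 2 * m

# Assemble X = (0 X12; -X12' 0).
X = [[0] * e for _ in range(e)]
for r in range(m):
    for s in range(m):
        X[r][m + s] = X12[r][s]
        X[m + s][r] = -X12[r][s]

def jordan_like(p):
    """P = pE + pU of order e, for the scalar case p = +1 or p = -1."""
    return [[p if j in (i, i + 1) else 0 for j in range(e)] for i in range(e)]

invariant = all(
    matmul(matmul(jordan_like(p), X), transpose(jordan_like(p))) == X
    for p in (1, -1)
)
skew = transpose(X) == [[-x for x in row] for row in X]
anti_diagonal = [X[r][e - 1 - r] for r in range(e)]
zero_below = all(X[r][s] == 0 for r in range(e) for s in range(e) if r + s > e - 1)
print(invariant, skew, anti_diagonal, zero_below)
```

The list `anti_diagonal` alternates in sign, in agreement with (28), and every element below the secondary diagonal vanishes.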

Type bz. e=2m-+1. The matrix p is necessarily a two-rowed matrix 
and each element s,s of S is of the form s,s = drs + ibrs. Further, since 


se. = 1)?"Si¢ 


Se, 1s Skew symmetric and, as a consequence of (28), srs is skew symmetric, 
when r-+s—~—e+1. If k=—1, and g =— a8 a Con- 
sequence of (43), 


Cm,m+1 == Sm,m+1 — Am,m+1 = 


Therefore we may SUPPOSe Sm,m41 to be skew symmetric. By a process analogous 
to that adopted for the case e = 2m it may be shown that 


where «== + 1 and Y is uniquely determined. In particular 


(47) Yrs = 0, (r,s = 1, “, 
and 
(48) Yrs = 0, r+s>e+2. 


For example, if e = 5, 


ro 0 i/2 31/2 i) 

Ye | #/2—i/72 ¢ 0 
3/72 —i 0 0 0 
i 0 


We have therefore 

Result IV. In type $b_2$, when $e_j = 2m_j + 1$, $S_j \cong \epsilon Y_j$ where $\epsilon = \pm 1$ and $Y_j$ is uniquely determined.

By combining Results I, II, III, and IV it is possible to determine a normal form for $S$ under admissible transformations. This normal form is not completely determined by the elementary divisors of $Q - \lambda E$ or of $A - \lambda E$. With each elementary divisor of the form $(\lambda \pm 1)^{2m}$ and with each pair of conjugate elementary divisors of the form $(\lambda - a \pm ib)^e$, $a^2 + b^2 = 1$, is associated a positive or negative sign.

⁹ Cf. Turnbull and Aitken, Canonical Matrices, pp. 155-159.


Before proceeding to show that the elementary divisors together with the signs attached to them completely determine the normal form for $S$, we deduce

THEOREM 3. If $A$ is a canonical matrix the determinant of $A$ has the value 1.

This is an immediate consequence of the fact that the determinant of $A$ is the product of the latent roots of $A$ and that the latent root $-1$ must occur an even number of times (Result II).
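Theorem 3 can be illustrated with exact arithmetic. The sketch below is not a proof; it only evaluates the determinant of a few canonical matrices, all of them hypothetical examples, with $G$ taken as $\begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$ (an assumed sign convention).

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def make_G(n):
    G = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        G[i][n + i] = 1
        G[n + i][i] = -1
    return G

def is_canonical(A):
    G = make_G(len(A) // 2)
    return matmul(matmul(A, G), transpose(A)) == G

def det(M):
    """Determinant by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return d

upper = [[1, 0, 2, 1], [0, 1, 1, 3], [0, 0, 1, 0], [0, 0, 0, 1]]
lower = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 0], [1, 2, 0, 1]]
A = matmul(upper, lower)
minus_A = [[-x for x in row] for row in A]   # all latent roots change sign
minus_E = [[-1, 0], [0, -1]]                 # latent root -1 occurring twice

for M in (A, minus_A, minus_E):
    print(is_canonical(M), det(M))
```

The matrix `minus_E` shows the point of the argument: the latent root $-1$ occurs an even number of times, so the determinant is $+1$.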


5. Necessary conditions. Let $A_1$ and $A_2$ be two canonical matrices, which are similar under a canonical transformation. Then, by Theorem 1, the associated skew-symmetric matrices $S_1$ and $S_2$ are equivalent under an admissible transformation. The matrices $S_1$ and $S_2$ may be taken in the normal form of the previous section and are accordingly diagonal block matrices, whose component block matrices differ at most in sign. If, with the notation of (20),

$Q = [Q_1, Q_2, \dots, Q_k]$, $S_1 = [\sigma_1, \sigma_2, \dots, \sigma_k]$ and $S_2 = [\tau_1, \tau_2, \dots, \tau_k]$,

there then exist $k$ non-singular matrices $W_j$ such that

$W_j \sigma_j W_j' = \tau_j$ and $W_j Q_j = Q_j W_j$ $(j = 1, 2, \dots, k)$.

We need, therefore, consider only equations of the type

$W\sigma W' = \tau$, $W\pi = \pi W$,

where $\pi$ is defined by (13). If $\sigma = (\sigma_{ij})$, $\tau = (\tau_{ij})$ and $W = (W_{ij})$, $(i, j = 1, 2, \dots, t)$, we have the equations

(49) $\sum_{\alpha=1}^{t} \sum_{\beta=1}^{t} W_{i\alpha}\sigma_{\alpha\beta}W_{j\beta}' = \tau_{ij}$ $(i, j = 1, 2, \dots, t)$.

If oi; = Ki;T; and 14; = FijT;, equation (49) becomes 


t 
> WiaKapT pW’ jp = FijTj, 
a=1 B=1 
or, by Lemma 4, 


> W iak agW jpT = (1,j7 = 1, 
a=1 f=1 

It is a consequence of the nature of the matrices W;;, Kij;, etc., that this last 


equation implies 


t t 
(50) > Dd Wiakagwig = fis, (1, j = 1, 2,° -,t), 


a=1 B=1 




where each small letter denotes the element in the first row and the first 
column of the matrix denoted by the corresponding capital letter. Since o and 
in normal form = rij 0, if eg Further wij = 0, if < 
and, by Lemma = 0, if >e;. Hence, if > = * = Ca > Car, 
we have as a result of (50) and Lemma 4, 


d d 
(51) Wiakapwip = (4,7 =c,c+1,°° 
pP=c 


If B is the matrix whose elements are wij, (1,7 =c,c+1,:°-,d), 


(50) may be written in the form 


(52) B( kis) B = (fii), (1, c,e+1, , a). 
Since | B| is a factor of | W| and W is non-singular so is B.1° The 
(d —c)-rowed square matrices (ki;) and (fij), (47 are 
therefore conjunctively equivalent. Further, as a consequence of results I-IV, 
(fi;) coincides with (ki;), unless the matrix P; is of type b. In this last case 


(kis) = [ecg, and (fis) = [ecg, 


0 1 
where ej = + = +1landg—lor ( Therefore we deduce from 


(52) that 

Hence the number of positive $\epsilon_j$ is the same as the number of positive $\epsilon_j'$. We may call the number of positive $\epsilon_j$ the index of the elementary divisors $(\lambda \pm 1)^{2m}$ or of the pair of conjugate elementary divisors $(\lambda - a \pm ib)^e$.

Hence by Theorem 1 we have

THEOREM 4. Necessary and sufficient conditions, that two canonical matrices $A_1$ and $A_2$ be similar under a canonical transformation, are that

(α) the elementary divisors of the pencil $A_1 - \lambda E$ be the same as those of the pencil $A_2 - \lambda E$, and that

(β) the indices of all elementary divisors $(\lambda \pm 1)^{2m}$ and of all pairs of conjugate elementary divisors $(\lambda - a \pm ib)^e$, $a^2 + b^2 = 1$, be the same for both pencils.

¹⁰ Cf. John Williamson, “The equivalence of non-singular pencils of hermitian matrices in an arbitrary field,” American Journal of Mathematics, vol. 57 (1935), pp. 484-485.


6. Normal form of a canonical matrix. In order to determine the normal form, to which a canonical matrix $A$ may be reduced by a canonical transformation, it is only necessary, on account of Theorem 1, to reduce the associated skew symmetric matrix $S$ of section 4 to the form $G$ by a matrix $R$, and then to determine $RQR^{-1}$. As a first step we reduce each $S_j$ of type b to the form $G_j$.


Type $b_1$. Since $e_j = 2m$, we may write

$P_{2m} = \begin{pmatrix} P_m & L_m \\ 0 & P_m \end{pmatrix}$,

where $P_m$ is of order $m$ and is defined by (12), while all elements of $L_m$ are zero except the element in the last row and first column which has the value $p$.


By result III 
0 


where C is a non-singular matrix of order m. Since 


it is easily verified that 


EL 0 
Let R -( . Then 
0 En 
0 )- Gm; 
and 
where 
0 0- 0 
0 0: 0 
(55) Mm = 0 


Accordingly we have 


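Result III$_a$. In type $b_1$, when $e_j = 2m$, $P_{e_j}$ may be replaced by

$Z_{2m} = \begin{pmatrix} P_m & \epsilon M_m \\ 0 & (P_m')^{-1} \end{pmatrix}$,

where $M_m$ is defined by (55) and $\epsilon = \pm 1$. Then $S_j \cong G_{2m}$.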


For example, if $m = 3$ and $\epsilon = 1$, $P_{2m}$ may be replaced by

$\begin{pmatrix} p & p & 0 & 0 & 0 & 0 \\ 0 & p & p & 0 & 0 & 0 \\ 0 & 0 & p & p & -p & p \\ 0 & 0 & 0 & p & 0 & 0 \\ 0 & 0 & 0 & -p & p & 0 \\ 0 & 0 & 0 & p & -p & p \end{pmatrix}$.

If $p = 1$, this last matrix is a canonical matrix of order six with the single elementary divisor $(\lambda - 1)^6$.
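Both assertions about the example of order six can be tested. In the sketch below the matrix is written out with $p = 1$; its third and sixth rows are partly a reconstruction of a damaged display (the blocks are taken as $P_3$, a block with last row $(1, -1, 1)$, and $(P_3')^{-1}$), so the check also serves as a test of that reading. A single elementary divisor $(\lambda - 1)^6$ means that $N = Z - E$ satisfies $N^5 \neq 0$ and $N^6 = 0$.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def make_G(n):
    """G = [[0, E], [-E, 0]] of order 2n (assumed sign convention)."""
    G = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        G[i][n + i] = 1
        G[n + i][i] = -1
    return G

def power(M, k):
    """k-th power of a square matrix by repeated multiplication."""
    R = [[int(i == j) for j in range(len(M))] for i in range(len(M))]
    for _ in range(k):
        R = matmul(R, M)
    return R

# Assumed reading of the example, with p = 1.
Z = [[1, 1, 0, 0, 0, 0],
     [0, 1, 1, 0, 0, 0],
     [0, 0, 1, 1, -1, 1],
     [0, 0, 0, 1, 0, 0],
     [0, 0, 0, -1, 1, 0],
     [0, 0, 0, 1, -1, 1]]

G = make_G(3)
canonical = matmul(matmul(Z, G), transpose(Z)) == G

N = [[Z[i][j] - int(i == j) for j in range(6)] for i in range(6)]
zero = [[0] * 6 for _ in range(6)]
single_divisor = power(N, 5) != zero and power(N, 6) == zero
print(canonical, single_divisor)  # prints: True True
```

A matrix with two or more elementary divisors belonging to the latent root 1 would already give $N^5 = 0$, so the second value distinguishes the two cases.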


Type $b_2$. Since $e_j = 2m + 1$ we may write

$P_{2m+1} = \begin{pmatrix} P_m & L_m \\ 0 & P_{m+1} \end{pmatrix}$,

where the only non-zero element of $L_m$ is an element $p$ in the last row and first column. By Result IV


and 


0K 
0); 


where D is a non-singular (m + 1)-rowed matrix, while K consists of the first 
m rows of —D’. As in the previous case we deduce that 


(56) — D. 
Let B— Then 
En 0 0 K\(E 0 
5 ? m 


0 eK Bu 0 
| 0 0 —e(D’)* 


Since K is formed of the first m rows of — D’, —K(D’) = (Em,0). Further 
the first m elements of the last row of the product on the right of (57) are 
zero, while the last m + 1 are the elements in the first row of «(D~*)’. The 
only element in the last column of D- different from zero is the last, which 
has the value (—1)”-1(7). Therefore the only element different from zero 
in the first row of (D-*)’ is the last, which has the value (— 1)”-1(1)’= (—1)™2. 
Accordingly it follows from (57) that 


0 Em 0 


(58) 2. 0 
0 0 (—1)™e 




But by (56) 


Pas eNn 
(59) RI omit 0 


where the last row of Nm is p times the first row of — D, that is 


( 0 0 0 
0 0) 0 
ip ip ‘ ( 1)™-1 
= 2 


It is not possible to proceed any further with the reduction without breaking 
up some of the two-rowed matrices into their component elements. Accordingly 
we write Nm = Km&,%2, where a, and a are matrices of a single column, and 
(P’ms1)7 in the form 
(P’m)* 
= 8; a b 
—ba 


where y;, y2 are matrices of a single column and 4,, 8, matrices of a single row. 
Then the matrix F in (59) becomes 


Pin eK m €%, 
0 
0 8; a b 
0 & —b a 


If «e—(—1)™, so that «(—1)”i =i, a simple interchange of rows and 
columns reduces the matrix on the right of (58) to 


0 0 
0 0 0 Ii 
=¢ 
i 
0 —1 0 0 
and F to 
EK», 
0 a b 
(61) Ze, = 


0 (Pm) Y2 
0—b & a 


01\/0 —1\/01 01 
On the other hand, if = (— 1)”"’, since 




the matrix on the right of (58) may be reduced to $G_{e_j}$ and $F$ to a matrix obtained from (61) by interchanging the subscripts 1 and 2 and $b$ with $-b$.

We therefore have

Result IV$_a$. In type $b_2$, when $e_j = 2m + 1$, $P_{2m+1}$ may be replaced by one of the forms (61). Then $S_j \cong G_{e_j}$.


By the above processes we may reduce $S$ to the form $[G_1, G_2, \dots, G_k]$, where $G_j = \begin{pmatrix} 0 & E_j \\ -E_j & 0 \end{pmatrix}$, and $Q$ to the form $[Z_1, Z_2, \dots, Z_k]$, where $Z_j$ is determined from one of the Results I, II, III$_a$ and IV$_a$. Let

$Z_j = \begin{pmatrix} Z_{j,11} & Z_{j,12} \\ Z_{j,21} & Z_{j,22} \end{pmatrix}$,

where $Z_{j,rs}$ is a square matrix of the same order as $E_j$. Then by a simple interchange of rows and the same interchange of columns, the matrix $[G_1, G_2, \dots, G_k]$ may be reduced to $G$ and at the same time $[Z_1, Z_2, \dots, Z_k]$ to

(62) $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$.

The matrices $A_{rs}$ in (62) are defined by

(63) $A_{rs} = [Z_{1,rs}, Z_{2,rs}, \dots, Z_{k,rs}]$ $(r, s = 1, 2)$.

The matrices (62) are uniquely determined, apart from a rearrangement of rows and the same rearrangement of the columns, by the elementary divisors of $A - \lambda E$ and the indices of these elementary divisors. Therefore we have

THEOREM 5. Any canonical matrix is similar under a canonical transformation to one and (essentially) only one of the matrices (62).
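The interchange of rows and columns that carries $[G_1, G_2, \dots, G_k]$ into $G$ can be written down explicitly. The sketch below uses two hypothetical blocks of orders 2 and 4 and a rearrangement assumed here for illustration: the first halves of all blocks are listed first, followed by the second halves. The same rearrangement is then applied to $[Z_1, Z_2]$ to exhibit the blocks $A_{rs}$ of (63).

```python
def block_diag(blocks):
    """Diagonal block matrix [B1, B2, ...] from square blocks."""
    size = sum(len(B) for B in blocks)
    M = [[0] * size for _ in range(size)]
    offset = 0
    for B in blocks:
        for i, row in enumerate(B):
            for j, x in enumerate(row):
                M[offset + i][offset + j] = x
        offset += len(B)
    return M

def make_G(n):
    G = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        G[i][n + i] = 1
        G[n + i][i] = -1
    return G

def rearrange(M, perm):
    """Apply the same permutation to the rows and to the columns of M."""
    return [[M[i][j] for j in perm] for i in perm]

orders = [1, 2]                      # hypothetical orders e_1, e_2 of E_1, E_2
firsts, seconds, offset = [], [], 0
for e in orders:
    firsts += list(range(offset, offset + e))
    seconds += list(range(offset + e, offset + 2 * e))
    offset += 2 * e
perm = firsts + seconds              # here [0, 2, 3, 1, 4, 5]

Z1 = [[1, 5], [0, 1]]                                             # hypothetical Z_1
Z2 = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -1, 1]]    # hypothetical Z_2

G_total = rearrange(block_diag([make_G(e) for e in orders]), perm)
A = rearrange(block_diag([Z1, Z2]), perm)
n = sum(orders)
A11 = [row[:n] for row in A[:n]]
A12 = [row[n:] for row in A[:n]]
print(G_total == make_G(n), A11, A12)
```

With these data `A11` is $[Z_{1,11}, Z_{2,11}]$ and `A12` is $[Z_{1,12}, Z_{2,12}]$, as (63) requires.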


THE JOHNS HOPKINS UNIVERSITY. 


CRITERIA FOR CERTAIN HIGHER CONGRUENCES.* 


By LEONARD CARLITZ. 


1. Introduction. The congruences in question are of the form 
8 8 pra-v 
(1.1) = M (mod P), 
i=0 


where $M$ and $P$ are polynomials in an indeterminate $x$ with coefficients in the Galois field $GF(p^n)$ of order $p^n$, and $P$ is irreducible. As for the coefficients in the left member of (1.1), if we put


FP, Fo—1, 


then we define 


Thus the polynomial in $u$ occurring in (1.1) closely resembles the polynomial¹

(1.5) $\psi_s(u) = \sum_{i=0}^{s} (-1)^{s-i} \begin{bmatrix} s \\ i \end{bmatrix} u^{p^{ni}}$,

which has the characteristic property

(1.6) $\psi_s(u) = \prod_{\deg E < s} (u - E)$,

the product extending over all polynomials $E$ (including 0) of degree $< s$. A closer connection will appear below.
If now we put 
M = A*"“” (mod P), 


as may always be done, we shall derive the following criterion * for the con- 
gruence (1.1): Let 


* Received December 11, 1936. 

¹ See Duke Mathematical Journal, vol. 1 (1936), pp. 139-142; this paper will be cited as DJ. For the congruence $\psi_s(u) \equiv M$, see Bulletin of the American Mathematical Society, vol. 41 (1935), pp. 907-914.

² For the case $s = 1$, see DJ, pp. 164-168.


(1.7) $P = x^k + c_1x^{k-1} + \cdots + c_k$ ($c_i$ in $GF(p^n)$),
$P' = kx^{k-1} + (k-1)c_1x^{k-2} + \cdots + c_{k-1}$,

so that $P'$ is the (formal) derivative of $P$. Assume $k > s$: then the congruence (1.1) is solvable if and only if the product $AP'$ is congruent (mod $P$) to a polynomial of degree $< k - s$. If this condition is satisfied the congruence has precisely $p^{ns}$ solutions.


2. The polynomials $g_s(u)$ and $f_s(u)$. We denote by $g_s(u)$ the polynomial in the left member of (1.1). Since, by (1.2),


[s] = [s—+] + 
it is evident from (1.3) and (1.4) that 


[s]Fe-s re 


8-i-1 


so that 


for 0 << s—iSs: by properly defining our symbols we may assert that (2. 1) 
holds also for s—i—0,s. Then, by substituting in the left member of (1.1), 
we have 


4-0 


from which it follows that ® 
(2.2) = (u) — ge" (Fi? ur") = gh" (u— ur"). 


If we define $g_0(u) = u$, it is clear that (2.2) holds for all $s \geq 1$.
We next define the polynomial $f_s(u)$ by means of


nt 
i=0 


where k is some fixed integer > 0. Then we have 

*The quantity Ff? may be defined in terms of the symbol wo", Otherwise the 
formula (2. 2) may a. interpreted as a congruence (mod P), in which case no new 
symbol is required; this interpretation is sufficient for the application. 




i=0 4=0 PsP 4? 
p” 
uw", 


from which follows the formula 
(2. 4) fa( ur") — Fe" = fo" (uw), 
fors=2. Now by (2.3), 


k-1 k-1 


4-0 i-0 
Thus if we define $f_0(u)$ by means of

(2.5) $f_0(u) = u - u^{p^{nk}}$,

it is evident that (2.4) holds for all $s \geq 1$.
Making use of the formulas (2.2) and (2.4), we now prove the identical congruence

(2.6) $f_s(g_s(u)) \equiv u - u^{p^{nk}} \pmod{P}$,

where $P$ is irreducible of degree $k$.
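The right member of (2.6) vanishes whenever a polynomial in $x$ is substituted for $u$, since the residues (mod $P$) form a field of $p^{nk}$ elements. The sketch below checks this in the smallest interesting case, $p = 2$, $n = 1$, $P = x^3 + x + 1$, with polynomials over $GF(2)$ encoded as bit patterns. It also checks the case $s = 1$ of (2.6) under the assumption that $g_1(u) = u - u^{p^n}$ and $f_1(u) = u + u^{p^n} + \cdots + u^{p^{n(k-1)}}$; the second formula is an assumed form, chosen because the sum then telescopes.

```python
def polymulmod(a, b, P):
    """Product of two GF(2)[x] polynomials (bit-encoded ints, already reduced) modulo P."""
    deg = P.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:
            a ^= P
    return result

def frobenius(u, times, P):
    """Apply the map u -> u^2 the given number of times, modulo P."""
    for _ in range(times):
        u = polymulmod(u, u, P)
    return u

P = 0b1011            # x^3 + x + 1, irreducible over GF(2); p = 2, n = 1
k = 3                 # degree of P
residues = range(1 << k)

# u^(p^(nk)) = u for every residue u, so u - u^(p^(nk)) vanishes.
frobenius_fixes_all = all(frobenius(u, k, P) == u for u in residues)

def g1(u):
    """Assumed g_1(u) = u - u^(p^n); subtraction is XOR in characteristic 2."""
    return u ^ frobenius(u, 1, P)

def f1(v):
    """Assumed f_1(v) = v + v^(p^n) + v^(p^(2n)) for k = 3."""
    return v ^ frobenius(v, 1, P) ^ frobenius(v, 2, P)

telescopes = all(f1(g1(u)) == (u ^ frobenius(u, k, P)) for u in residues)
print(frobenius_fixes_all, telescopes)  # prints: True True
```

Other irreducible moduli over $GF(2)$ can be tried by changing `P` and `k`; for $k \neq 3$ the assumed `f1` would need one term for each exponent $0, 1, \dots, k-1$.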
For s = 0, (2.6) follows at once from (2.5) and go(u) =u. For s=1, 


gi(u) =u—u™", fi(u) filgi(u)) =u—w", 


so that (2.6) holds in this case also. We now assume that (2.6) holds up to 


and including the value s—1. Clearly * 


is uniquely determined (mod P). Then, by (2.2) and (2.4), we have for 
1, 


(u— Fie") } 
(2.8) = Few") — 


4 Compare footnote 3.




since (2.6) is assumed to hold for s—1. We rewrite (2.8) in the form 


8-1 8-1 


or 
8-1 


If now we compare this with the left member of (2.7) and replace $u^{p^n}$ by $u$,


we get 


fs(gs(u)) =cFs. + u— w™ (mod P), 


where $c$ is in $GF(p^n)$; but from the form of $f_s(u)$ and $g_s(u)$ it is clear that
$c = 0$, so that (2.6) holds for the value $s$.
In a similar way we may prove the identical congruence


(2.9) $g_s(f_s(u)) \equiv u - u^{p^{nk}} \pmod{P}$,
which obviously holds for $s = 0, 1$. Indeed to prove (2.9) note that
Go (fa(ur")) == — } 
= 9?" {fos(u)} = (u—w")™ (mod P), 
which completes the induction. From (2.6) and (2.9) follows 
THEOREM 1. For $P$ irreducible of degree $k$, and $0 \leq s \leq k$,
$f_s(g_s(u)) \equiv u - u^{p^{nk}} \equiv g_s(f_s(u)) \pmod{P}$.


It is of some interest to observe that (2.6) and (2.9) are equivalent, 
that is, either implies the other (without the use of formulas (2.2) and (2.4) ). 


This is a consequence of the following 
LEMMA. If $f(u) = \sum a_j u^{p^{nj}}$, $g(u) = \sum b_j u^{p^{nj}}$, and
$f(g(u)) \equiv u - u^{p^{nk}} \pmod{P}$,
where $P$ is irreducible of degree $k$, then also
$g(f(u)) \equiv u - u^{p^{nk}} \pmod{P}$.
Since the proof is very much like the proof of a similar theorem * proved 
elsewhere, it will be omitted here. 


3. Factorization of $f_s(u)$. From the identity (1.6) it follows readily
that


(3. 1) = (— 1)* IT (1 u/L), 


deg E<k-s 


p. 152. 




the product extending over all $E$ (except 0) of degree $< k-s$. On the other
hand, by (1.5) and the first of (3.1),


+8-1) 


(8-1) (8-1) (+8-)) 


k-8-j 


Now, from (1.3), it is easily seen that 


ie) = (—1)#* (mod P), 
z= (— 1) 
so that 
on?” n(s—1 F + n(j+s-1 
and therefore, 
(3. 2) ( (mod P). 
1 
Comparison of (3.2) and (3.1) leads to the factorization 
1) 
(3. 3) 1— | (mod P). 


But © Ly_-1 = (— 1)*"P’, where P’ is defined by (1.7) ; therefore we may put 
(3.3) in the form 


(3. 4) fe(u)=u II 


deg E<k-s 


uP’ 


THEOREM 2. The polynomial $f_s(u)$ factors completely (mod $P$); the
roots are $(E/P')^{p^{ns}}$, where $E$ ranges over the $p^{n(k-s)}$ polynomials of degree
$< k-s$, and $P'$ is the derivative of $P$.


4. Criteria for solvability. If the congruence
(4.1) $g_s(u) \equiv M \pmod{P}$
is assumed solvable, it follows from (2.6) that
(4.2) $f_s(M) \equiv f_s(g_s(u)) \equiv u - u^{p^{nk}} \pmod{P}$,


for $P$ irreducible of degree $k$. But by Fermat's Theorem, if $A$ is any quantity
(mod $P$), $A^{p^{nk}} \equiv A$, so that (4.2) implies


(4.3) $f_s(M) \equiv 0 \pmod{P}$;


*DJ, p. 166. 




that is, (4.3) is a necessary condition that (4.1) be solvable. To show that
this condition is also sufficient, we make use of Theorem 2 and formula (4.2).
By Theorem 2 we have the factorization


(4.4) $f_s(g_s(u)) \equiv C \prod_{\beta} (g_s(u) - \beta) \pmod{P}$,


where $\beta$ ranges over the roots of $f_s(\beta) \equiv 0$, and $C$ is independent of $u$. If,
now, we compare (4.4) with (4.2) and recall that $u^{p^{nk}} - u$ factors completely
into linear factors, it is clear that for all $\beta$ (satisfying the congruence $f_s(\beta) \equiv 0$)
the congruence $g_s(u) \equiv \beta$ is solvable; further, since $g_s(u) - \beta$ divides $u^{p^{nk}} - u$,
it follows that the congruence in question has the maximum number of solutions.
We may now state


THEOREM 3. The congruence $g_s(u) \equiv M \pmod{P}$, where $P$ is irreducible
of degree $k > s$, is solvable if and only if $f_s(M) \equiv 0 \pmod{P}$. If this condition
is satisfied, the congruence has precisely $p^{ns}$ solutions.


Now by Theorem 2, the roots of $f_s(u) \equiv 0$ are the quantities $(E/P')^{p^{ns}}$,
where $E$ is of degree $< k-s$. Thus it is necessary that $M$ be congruent to
one of these quantities. If then we replace $M$ by $A^{p^{ns}}$ (clearly $M$ uniquely
determines $A$), we have the


THEOREM 4. If $P$ is irreducible of degree $k > s$, the congruence
(4.5) $g_s(u) \equiv A^{p^{ns}} \pmod{P}$


is solvable if and only if the product $AP'$ is congruent (mod $P$) to a polynomial
of degree $< k-s$. If $u_0$ is a particular solution of (4.5), then the
general solution is $u_0 + \rho$, where $\rho$ ranges over the $p^{ns}$ roots of $g_s(u) \equiv 0 \pmod{P}$.
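
The case $s = 1$ of Theorem 4 can be checked by brute force on a small field. The sketch below is only an illustration under stated assumptions: it takes $n = 1$, $p = 3$, uses $g_1(u) = u - u^{p^n}$ as given in Section 5, and picks the irreducible cubic $P = x^3 + x^2 + 2$ over $GF(3)$, which is a choice made here and not taken from the paper. All function names are hypothetical helpers.

```python
from itertools import product

p = 3                      # characteristic; n = 1, so GF(p^n) = GF(3) (assumed for the demo)
P = (2, 0, 1, 1)           # P = x^3 + x^2 + 2, coefficients from low to high degree (our choice)
k = len(P) - 1             # degree of P

def reduce_mod_P(coeffs):
    """Reduce a coefficient list modulo p and modulo the monic polynomial P."""
    c = [v % p for v in coeffs]
    while len(c) > k:
        lead = c.pop()                 # coefficient of x^d, where d = len(c) after the pop
        shift = len(c) - k
        for i in range(k):             # x^k = -(P[0] + ... + P[k-1] x^(k-1)) because P is monic
            c[shift + i] = (c[shift + i] - lead * P[i]) % p
    return tuple(c + [0] * (k - len(c)))

def mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return reduce_mod_P(out)

def power(a, e):
    result = reduce_mod_P([1])
    for _ in range(e):
        result = mul(result, a)
    return result

def sub(a, b):
    return tuple((x - y) % p for x, y in zip(a, b))

def degree(a):
    return max((i for i, v in enumerate(a) if v), default=-1)

P_prime = reduce_mod_P([i * P[i] for i in range(1, k + 1)])   # formal derivative of P
residues = list(product(range(p), repeat=k))                  # all residues mod P

def solutions(A):
    """All u with g_1(u) = u - u^p congruent to A^p (mod P), found by exhaustion."""
    M = power(A, p)
    return [u for u in residues if sub(u, power(u, p)) == M]

def criterion(A):
    """Theorem 4 with s = 1: A*P' is congruent to a polynomial of degree < k - 1."""
    return degree(mul(A, P_prime)) < k - 1

checks = [bool(solutions(A)) == criterion(A) for A in residues]
counts = {len(solutions(A)) for A in residues if criterion(A)}
```

Here `checks` records, for each residue $A$, whether solvability agrees with the degree criterion, and `counts` collects the number of solutions in the solvable cases, which the theorem predicts to be $p^{ns} = 3$.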


The second part of the theorem follows at once from the observation that
$g_s(u) \equiv A^{p^{ns}}$ and $g_s(v) \equiv A^{p^{ns}}$ imply $g_s(u-v) \equiv 0$. We shall now
determine the roots of $g_s(u) \equiv 0$.


5. The roots of $g_s(u) \equiv 0$. For $s = 1$, $g_1(u) = u - u^{p^n}$, and the roots
are evidently the $p^n$ elements of the $GF(p^n)$.

For s = 2, we make use of the recurrence (2.2). Thus we have the 
condition 
(5.1) =9."(u— =0. 


Therefore, by the preceding paragraph, 


u—F =c (c in GF(p")), 




so that 
— =c(a" — 2), 


and, therefore, we have at once $F_1 u = cx + c'$, where $c$ and $c'$ are arbitrary
elements of $GF(p^n)$. Thus by (5.1) the roots of $g_2(u) \equiv 0 \pmod{P}$ are
furnished by $(cx + c')/F_1$.

For the case s = 3, we again employ (2. 2) 


(5. 2) g3(w") = 92?" (u— F,'? =0; 
then, as above, we get 


ca+c¢ 
PF 
Fu” — = (ca + c’) 2), 
from which follows easily 
= + + 2) +c”, 


where $c, c', c''$ are in $GF(p^n)$; thus by (5.2) the roots of $g_3(u) \equiv 0 \pmod{P}$
are furnished by


+ + 7) + 


where $c, c', c''$ independently range over the elements of $GF(p^n)$.
It is now not difficult to determine the roots of $g_s(u) \equiv 0$. Let


(5. 3) oj = oj == - 


denote the j-th elementary symmetric function of the quantities x, 2?",---, x? 
thus we have the identity 


(5.4) (tx) (tf — 


j=90 
We shall prove 


THEOREM 5. The $p^{ns}$ roots of $g_s(u) \equiv 0 \pmod{P}$ are furnished by


(5. 5) 
5 : 


where the $c_j$ independently range over the elements of $GF(p^n)$, and $\sigma_j^{(s)}$ is
defined by (5.3).


The theorem is evidently true for s = 1, 2,3. Assuming it to hold up to 
and including the value s—1, we use (2. 2) 


p" — n n 
gs ue") = 0. 
Thus since the theorem is assumed true for the case s —1, we have at once 


It is clear that, to complete the induction, it is only necessary to show that 
From (5.4), it follows that 
8 
a") (¢+ (t+ 20") => (0; ) 
j=0 


combining this with (5.4) we have 


j=0 
But this implies 


in this formula replace s by s—1 and (5.6) follows immediately. This com- 


pletes the proof of the theorem. 
As an immediate corollary of Theorem 5 and the latter part of Theorem 4, 


we state 


THEOREM 6. If $u_0$ is a particular solution of $g_s(u) \equiv A^{p^{ns}} \pmod{P}$,
then the general solution is $u_0 + \rho$, where $\rho$ is determined by (5.5).


6. Some extensions. If $f_s(A^{p^{ns}}) \equiv 0 \pmod{P}$, it is clear from (2.4)
that also $f_{s-1}(A^{p^{n(s-1)}}) \equiv 0$; however, the converse is not true in general. Thus


it may happen that the congruence


$g_s(u) \equiv A^{p^{ns}} \pmod{P}$


is not solvable, while the congruence


$g_{s-1}(u) \equiv A^{p^{n(s-1)}} \pmod{P}$


is solvable. In this case it is easily seen that $g_s(u) - A^{p^{ns}}$ breaks up (mod $P$)
into a product of $p^{n(s-1)}$ factors each of degree $p^n$. This follows from the


formula 
(6.1) gs(u) — AMO == — — Ars", 




which is an immediate consequence of (2.2). Assume $f_{s-1}(A^{p^{n(s-1)}}) \equiv 0$, so
that by Theorem 3 we have the factorization


Go-s(u) — AP == (— I} (u—8) (mod P), 


where ranges over the roots of gs1(u) Substitution in 
(6.1) gives 


(6.2) gs(u) — == (—1)* 


II — u + 8") (mod P). 


More generally if we assume only $f_r(A^{p^{nr}}) \equiv 0 \pmod{P}$, where $r < s$,
then we may show that $g_s(u) - A^{p^{ns}}$ factors (mod $P$) into a product of $p^{nr}$
polynomials, each of degree $p^{n(s-r)}$. Thus formula (6.2) is the special case
$r = s-1$. We now show that in the general case we have a factorization
of the form


(6. 3) (Ut) 
F, ns 
where $G_{r,s}(u)$ is a linear⁷ polynomial of degree $p^{ns}$, and the product extends
over the roots of $g_r(\beta) \equiv A^{p^{nr}}$.
Let $r$ be fixed; the formula (6.3) is obviously true for $s = 0$. According


to (6.2), the formula holds for $s = 1$. Assume that (6.3) holds up to and
including the value $s-1$. Then by (6.1) we have


Fr 1 ns 
(Fy (67, — — (— 


thus completing the induction. It is also evident from the above that the
polynomial $G_{r,s}(u)$ satisfies the recurrence


(6. 4) G,.(u) = Gr o(u) =u. 


r+8-1 


We have therefore proved 


THEOREM 7. If $f_r(A^{p^{nr}}) \equiv 0 \pmod{P}$, the polynomial $g_{r+s}(u) - A^{p^{n(r+s)}}$
has the factorization (6.3), where $G_{r,s}(u)$ is determined by (6.4).


7 That is, of the form 




We shall now show that the polynomial $G_{r,s}(u) - (-1)^s \beta^{p^{ns}}$ occurring in
the right member of (6.3) can in general be factored further (mod $P$); irreducibility
occurs only in the case $n = 1 = s$, $f_{r+1}(A^{p^{n(r+1)}}) \not\equiv 0 \pmod{P}$.

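It is convenient to deal with the left member of (6.3). Let


(6.5) $h(u) = g_s(u) - M$,
where $M$ is arbitrary (mod $P$). Then by (2.6),

(6.6) $f_s\{h(u)\} = f_s\{g_s(u) - M\} \equiv u - u^{p^{nk}} - f_s(M) \pmod{P}$.
In the next place,


(6.7) $f_s\{h(u)\} - f_s^{p^{nk}}\{h(u)\} \equiv u - 2u^{p^{nk}} + u^{p^{2nk}} - \{f_s(M) - f_s^{p^{nk}}(M)\}$
$\equiv u - 2u^{p^{nk}} + u^{p^{2nk}} \pmod{P}$,


since for arbitrary $M$, $M^{p^{nk}} \equiv M \pmod{P}$. Now put


$u_1 = u - u^{p^{nk}}$,
$u_2 = u_1 - u_1^{p^{nk}} = u - 2u^{p^{nk}} + u^{p^{2nk}}$,
$\cdots$
$u_p = u_{p-1} - u_{p-1}^{p^{nk}} = u - u^{p^{nkp}}$.


Clearly $u_{j+1}$ is a multiple of $u_j$ $(j = 1, \cdots, p-1)$. Thus it follows that
$u_2$ divides $u_p$. Therefore by (6.6) and (6.7), the polynomial $h(u)$ is a
divisor of $u_p$:

(6.8) $h(u) \mid u - u^{p^{nkp}}$.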
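
The collapse of $u_p$ to the two-term form $u - u^{p^{nkp}}$ rests on the middle binomial coefficients vanishing modulo $p$. The following sketch, with a hypothetical helper name, tracks only the coefficients of $u, u^{Q}, u^{Q^2}, \ldots$ (writing $Q = p^{nk}$) under the step $u_{j+1} = u_j - u_j^{Q}$. It assumes the coefficients lie in the prime field, so that raising to the power $Q$ merely shifts them one place.

```python
def iterate_coefficients(p, steps):
    """Coefficients (mod p) of u, u^Q, u^(Q^2), ... after `steps` applications
    of the map v -> v - v^Q, starting from v = u."""
    coeffs = [1]                                   # u_0 = u
    for _ in range(steps):
        padded = coeffs + [0]                      # v itself
        shifted = [0] + coeffs                     # v^Q: every exponent index moves up by one
        coeffs = [(a - b) % p for a, b in zip(padded, shifted)]
    return coeffs

u2_mod_3 = iterate_coefficients(3, 2)              # u - 2u^Q + u^(Q^2), reduced mod 3
u5_mod_5 = iterate_coefficients(5, 5)              # only the first and last terms survive
```

After `p` steps the list has a leading 1, a trailing entry congruent to $-1$, and zeros in between, which is the statement $u_p = u - u^{Q^p}$.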


Now on the other hand since the set of residues (mod $P$) form a finite
field $GF(p^{nk})$, it follows from a well known theorem that all the irreducible
divisors (mod $P$) of $u_p$ are of degree 1 or $p$. Therefore by (6.8) the same is
true of $h(u)$. On the other hand it is evident from Theorem 6 that if
$g_s(u) - M$ has one linear factor, then it has $p^{ns}$ linear factors. This proves

the following 


THEOREM 8. For arbitrary $M$, the polynomial $g_s(u) - M$ (mod $P$) either
factors completely into linear factors, or else is a product of $p^{ns-1}$ irreducible
polynomials each of degree $p$.


By Theorem 3, if $f_s(M) \not\equiv 0 \pmod{P}$, the polynomial $g_s(u) - M$ has
no linear factor. Hence by the preceding theorem we have


THEOREM 9. If $f_s(M) \not\equiv 0 \pmod{P}$, then the polynomial $g_s(u) - M$ is
a product (mod $P$) of $p^{ns-1}$ irreducible factors each of degree $p$.




Suppose now in (6.5) we put M== Ar". Then comparison with (6.3) 
leads at once to 


THEOREM 10. Let $f_r(A^{p^{nr}}) \equiv 0$, $f_{r+1}(A^{p^{n(r+1)}}) \not\equiv 0 \pmod{P}$; let $\beta$ be a
root of $g_r(\beta) \equiv A^{p^{nr}}$. Then the polynomial $G_{r,s}(u) - (-1)^s \beta^{p^{ns}}$ occurring
in the right member of (6.3) is a product (mod $P$) of $p^{ns-1}$ irreducible polynomials
each of degree $p$.


In particular for n = 1, s = 1, we get 


THEOREM 11. If the hypotheses of Theorem 10 hold, and in addition 
n =s = 1, then the polynomial 


Gir (1) + —u+ pp 
is irreducible (mod P). 
It is not difficult to prove this theorem directly. 


DUKE UNIVERSITY, 
DURHAM, NORTH CAROLINA.




ON A TRIGONOMETRICAL SERIES OF RIEMANN.* 


By AUREL WINTNER.


In his paper on Riemann integrals and trigonometrical series, Riemann ? 


considers the series 


k 

and 
(2) 3 sin 
where 

(3) $\psi(x) = x - [x] - \tfrac{1}{2}$ if $x \neq [x]$, $\quad \psi(x) = 0$ if $x = [x]$,
and 
(4) (—1)4 


d\n 


the summation in (4) being extended over the d(n) divisors d of n. Riemann’s 
aim in considering the series (1), (2) is to illustrate the limitations to which 
his definition of an integral subjects the theory of Fourier series. In fact, 
Riemann observes that if x is rational, both series (1), (2) are convergent and 
represent the same value, while the function defined by these series on the set 
of rational numbers is a non-bounded function on the set of those rational 
numbers which are contained in any fixed interval. 

This statement of Riemann has recently been verified by Chowla and
Walfisz,² who discussed the series (1) and (2) for irrational values $x$ as well.
They proved, among other things, that the trigonometrical series (2) is 
almost everywhere convergent and represents almost everywhere the sum of 
the series (1). That the series (1) is almost everywhere convergent, is an 
obvious consequence of Khintchine’s metrical results concerning diophantine 
approximations. 

The object of the present note is an approach to Riemann’s series from 
the point of view of Lebesgue’s theory. It will be seen that the trigonometrical 
series (2) is a Fourier series in the sense of Lebesgue and belongs to the 
* Received April 28, 1937. 

¹ B. Riemann, Gesammelte mathematische Werke, 2nd edition, Leipzig, 1892, p. 263.

² S. Chowla and A. Walfisz, “Ueber eine Riemannsche Identität,” Acta Arithmetica,
vol. 1 (1935), pp. 87-112.




function defined by (1), i.e., that the odd periodic function (1) is integrable 


in the sense of Lebesgue and has the coefficients of (2) as Fourier constants.
It will also be shown that not only the function (1) but also every positive
power of it is integrable in the sense of Lebesgue, i.e., that the function (1)
is of class $L^p$ for arbitrarily large $p$, although this function is non-bounded in
every interval (also if one discards sets of measure zero). It has, perhaps,
a historical interest that the Lebesgue theory of integration and of Fourier 
series applies without difficulty to the example by means of which Riemann 
himself illustrated the limitations of his theory. 

That (2) is almost everywhere convergent to the function (1), will turn 
out to be an immediate consequence of the fact * that if a Fourier series 


(5₁) $f(x) \sim \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$
is such that
(5₂) $\sum_{n=1}^{\infty} (a_n^2 + b_n^2)\, n^{\delta} < +\infty$, where $\delta > 0$,


then (5₁) is convergent almost everywhere to the function $f(x)$. The exceptional
$x$-set of measure zero is, according to Chowla and Walfisz,² such that
its elements $x$ essentially depend on the arithmetical structure of the number $x$.
First, it will be shown that the series


(6) $\sum_{m=1}^{\infty} \frac{\psi(mx)}{m}$


is convergent in the mean, i.e., that


(7) lim f 


where $f_n(x)$ denotes the odd periodic R-integrable function


(8) $f_n(x) = \sum_{m=1}^{n} \frac{\psi(mx)}{m}$.


Since the Fourier series of the Bernoulli polynomial (3) is


(9) $\psi(x) \sim -\frac{1}{\pi} \sum_{k=1}^{\infty} \frac{1}{k} \sin 2\pi k x$,


it is seen from (8) that


³ According to Kolmogoroff and Seliverstoff, one can replace $n^{\delta}$ in (5₂) by $\log n$;
cf. A. Kolmogoroff and G. Seliverstoff, “Sur la convergence des séries de Fourier,”
Rendiconti della Reale Accademia Nazionale dei Lincei, ser. 6, vol. 3 (1926), pp. 307-310.




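(10) $f_n(x) \sim -\frac{1}{\pi} \sum_{k=1}^{\infty} \frac{1}{k} \Big( \sum_{d \mid k,\; 1 \leq d \leq n} 1 \Big) \sin 2\pi k x$,


where the inner sum is extended over those divisors $d$ of $k$ which are not
greater than $n$. Thus, for every positive integer $j$,


$f_{n+j}(x) - f_n(x) \sim -\frac{1}{\pi} \sum_{k=1}^{\infty} \frac{1}{k} \Big( \sum_{d \mid k,\; n < d \leq n+j} 1 \Big) \sin 2\pi k x$.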
k=1 k d\k 
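Hence it is seen from Parseval's relation that (7) is equivalent to


$\lim_{n \to \infty} \sum_{k=1}^{\infty} \frac{1}{k^2} \Big( \sum_{d \mid k,\; n < d \leq n+j} 1 \Big)^2 = 0$,


i.e., to


$\lim_{n \to \infty} \sum_{k=n+1}^{\infty} \frac{1}{k^2} \Big( \sum_{d \mid k} 1 \Big)^2 = 0$.


Since $\sum_{d \mid k} 1$ is the number $d(k)$ of divisors of $k$, it follows that one merely has
to show the convergence of the series


$\sum_{k=1}^{\infty} \frac{d(k)^2}{k^2}$.

Now

(11) $\sum_{k=1}^{\infty} \frac{d(k)^2}{k^{\delta}}$


is convergent for every $\delta > 1$, since $|\zeta(\sigma + it)|^4$ has the finite mean value

$\lim_{T \to \infty} \frac{1}{T} \int_0^T |\zeta(\sigma + it)|^4 \, dt = \sum_{k=1}^{\infty} \frac{d(k)^2}{k^{2\sigma}}$

for⁴ every $\sigma > \tfrac{1}{2}$.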
By a well-known theorem of Fischer, the relation (7) just proved implies
the existence of a function $f(x)$ which belongs to the class $L^2$ and is such that


(12) $\lim_{n \to \infty} \int_0^1 \{f(x) - f_n(x)\}^2 \, dx = 0$.


Since the functions (8) are the partial sums of the series (6), the relation (12)


⁴ Cf., e.g., E. C. Titchmarsh, The zeta-function of Riemann, Cambridge, 1930, pp.
38-41.




may be expressed by saying that the series (6) converges in the mean to a
function $f(x)$ of class $L^2$.

It is well known that (12) implies the existence of a subsequence of
$\{f_n(x)\}$ such that this subsequence of $\{f_n(x)\}$ tends almost everywhere to
$f(x)$. In other words, there exists an increasing sequence $\{\nu_n\}$ of positive
integers such that if one unites the first $\nu_1$, then the next $\nu_2$ terms of the series
(6), and so on, the resulting “bracketed” series converges almost everywhere
to $f(x)$. Actually, the introduction of the brackets is superfluous in view of
Khintchine's result referred to above. This fact will not be needed in what
follows. For (12) in itself allows one to consider (6) as the definition of an
odd periodic function $f(x)$ of class $L^2$, this function being undetermined on
a set of measure zero.

Since the functions (8) tend in the mean to the function (6) of class $L^2$,
the $k$-th Fourier constant of (8) tends, as $n \to \infty$, to the $k$-th Fourier constant
of (6) for every fixed $k$. Hence it is seen from (10) that the $k$-th
Fourier constant of (6) is

$\lim_{n \to \infty} -\frac{1}{\pi k} \sum_{d \mid k,\; 1 \leq d \leq n} 1 = -\frac{1}{\pi k} \sum_{d \mid k} 1 = -\frac{d(k)}{\pi k}$.


In other words,


(13) $f(x) \sim -\frac{1}{\pi} \sum_{k=1}^{\infty} \frac{d(k)}{k} \sin 2\pi k x$.


It follows that the function (6) is of class $L^p$ not only for $p = 2$ but for
arbitrarily large $p$. In fact, on applying to (13) Hausdorff's extension of the
Fischer-Riesz theorem, it is seen that (6) is of class $L^p$ for every $p$ if


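$\sum_{k=1}^{\infty} \Big( \frac{d(k)}{k} \Big)^{p/(p-1)}$


is convergent for every $p \geq 2$. Thus it is sufficient to show that

$\sum_{k=1}^{\infty} \Big( \frac{d(k)}{k} \Big)^{1+\epsilon}$

is convergent for every positive $\epsilon \leq 1$. Since $d(k) \geq 1$, it follows that it is
sufficient to know the convergence of
$\sum_{k=1}^{\infty} \frac{d(k)^2}{k^{1+\epsilon}}$
for every $\epsilon > 0$. Since (11) is convergent for every $\delta > 1$, the proof is
complete.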




The convergence of (11) for $\delta > 1$ also implies that


$\sum_{k=1}^{\infty} \Big( \frac{d(k)}{k} \Big)^2 k^{\delta}$


is convergent for sufficiently small values of $\delta > 0$. Hence, on comparing (13)
with (5₂), it is seen from the criterion (5₁) that the Fourier series (13) is
convergent almost everywhere. Since the arithmetical means of a Fourier
series tend almost everywhere to the function to which it belongs, the sum of
the Fourier series (13) is almost everywhere equal to the function (6).

The above results concern not Riemann’s series (1), (2) but the func- 
tion (6). The transition to Riemann’s series can be based on the Bernoulli 
identity 


(14) $\psi(x + \tfrac{1}{2}) = \psi(2x) - \psi(x)$,


which is obvious from the definition (3). 
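
The identity (14) can be confirmed in exact rational arithmetic. The sketch below assumes that (3) is the usual Bernoulli function, equal to $x - [x] - \tfrac{1}{2}$ off the integers and to 0 at the integers; the function name `psi` and the sample grid are choices made here for illustration.

```python
from fractions import Fraction
from math import floor

def psi(x):
    """Bernoulli function assumed for (3): x - [x] - 1/2 off the integers, 0 at integers."""
    x = Fraction(x)
    if x.denominator == 1:
        return Fraction(0)
    return x - floor(x) - Fraction(1, 2)

# Grid of rationals in [-2, 2], including integers and half-integers.
sample = [Fraction(j, 24) for j in range(-48, 49)]
identity_holds = all(psi(x + Fraction(1, 2)) == psi(2 * x) - psi(x) for x in sample)
```

The grid deliberately contains the points where one of the three terms sits at a jump, since those are the only places where the identity is not a routine consequence of $[2x] = [x] + [x + \tfrac{1}{2}]$.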
First, since (6) is an odd periodic function of class $L^p$, where $p$ is arbitrarily
large, the same holds for the function


$\sum_{m=1}^{\infty} \frac{\psi(2mx)}{m}$.


It follows, therefore, from Hölder's inequality that the function


(15) $\sum_{m=1}^{\infty} \frac{\psi(2mx)}{m} - \sum_{m=1}^{\infty} \frac{\psi(mx)}{m}$


is of class $L^p$ for every $p$. Now (15) is, in view of (14), identical with
Riemann's function (1). Furthermore, it is seen from (13) that the Fourier
series of the difference (15) is

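(16) $-\frac{1}{\pi} \sum_{k=1}^{\infty} \frac{d(k)}{k} \sin 4\pi k x + \frac{1}{\pi} \sum_{k=1}^{\infty} \frac{d(k)}{k} \sin 2\pi k x$.


Now $d(k) = \sum_{d \mid k} 1$, so that the Fourier series (16) is, in view of (4), identical
with Riemann's trigonometrical series (2). Accordingly, Riemann's function


(1) is of class $L^p$ for every $p$ and has the Fourier series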


r+4 
(17) = 3 (2 (—1 D sin 


din 
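
The passage from (16) to a series whose coefficients are divisor sums with alternating signs amounts to an elementary counting identity: collecting the coefficient of $\sin 2\pi k x$ in (16) gives $2\,d(k/2) - d(k)$ for even $k$ and $-d(k)$ for odd $k$, up to the common factor $-1/(\pi k)$. The sketch below checks that this equals $\sum_{d \mid k} (-1)^d$, on the assumption that (4) denotes a sum of this kind; the helper names are hypothetical.

```python
def divisors(k):
    return [t for t in range(1, k + 1) if k % t == 0]

def d(k):
    """Number of divisors of k."""
    return len(divisors(k))

def alternating_divisor_sum(k):
    """Sum of (-1)^t over the divisors t of k (the assumed form of (4))."""
    return sum((-1) ** t for t in divisors(k))

def regrouped(k):
    """Coefficient obtained by collecting the two series of (16) at frequency k."""
    even_part = 2 * d(k // 2) if k % 2 == 0 else 0
    return even_part - d(k)

mismatches = [k for k in range(1, 301) if alternating_divisor_sum(k) != regrouped(k)]
```

The identity holds because the even divisors of $k$ correspond one to one with the divisors of $k/2$.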


Finally, the Fourier series (17) converges almost everywhere to the function 




(1). In fact, (1) is identical with (15), while the Fourier series (13) con- 
verges almost everywhere to the function (6). 

While $(\Psi(x))^n$, where $\Psi(x)$ denotes Riemann's function (1), is for
every $n$ integrable in the sense of Lebesgue, the function $\Psi(x)$ lies in every
sense outside of the range of Riemann's integration theory. In fact, if $\iota$ is any
subinterval of the interval $0 \leq x \leq 1$, there cannot exist a constant $M = M_\iota$
such that $|\Psi(x)| \leq M$ almost everywhere in $\iota$. For suppose, if possible,
that there exists an $M = M_\iota$ for some $\iota$. Then, if $\Delta$ is any closed interval
contained in the interior of $\iota$, one can find a constant $K = K_\Delta$ such that
$|S_n(x)| \leq K$ for every $x$ in $\Delta$ and for every $n$, where the $S_n(x)$ denote the
arithmetical means of the Fourier series (17) of $\Psi(x)$. In particular, the
$S_n(x)$ are uniformly bounded on the set of rational $x$ contained in $\Delta$. This
clearly contradicts the fact mentioned by Riemann¹ and verified by Chowla
and Walfisz.² It follows, in particular, that if a function $F(x)$ is almost
everywhere equal to Riemann's function (1), then $F(x)$ is discontinuous


at every $x$.


THE JOHNS HOPKINS UNIVERSITY. 




ON DIVERGENT INFINITE CONVOLUTIONS.* 


By E. R. van KAMPEN and AUREL WINTNER. 


Introduction. It is known that the distribution theory of sums of independent
random variables can be developed from several points of view
which are, however, in the main, equivalent. One, and possibly the most
general, approach is represented by Kolmogoroff's axiomatic treatment¹ of
questions of probability distribution. This approach applies, in particular,
to the problem of probable convergence of series of independent random
variables, as first solved by Khintchine and Kolmogoroff.² It also implies the
treatment based on the Lebesgue measure theory of infinite product spaces,
as developed by Steinhaus, Littlewood, Paley and Zygmund, Jessen, and others.³

It is known⁴ that the main result of Khintchine and Kolmogoroff, which
is based on the notion of “equivalent series,” can be formulated also in terms
of infinite convolutions. The results of the present paper concern infinite
convolutions and imply, among other things, certain facts which are equivalent
to theorems concerning the divergence problem of series of independent random
variables. In particular, the results imply essential refinements of certain facts
indicated by Lévy.⁵ Since Lévy's statements will not be used, the following
considerations imply detailed proofs for them.

Theorem 1, which applies not only to convolution sequences, seems to have 


an independent interest. Theorems 3 and 5 delimit all possibilities which can 


* Received March 31, 1937. 

¹ A. Kolmogoroff, “Grundbegriffe der Wahrscheinlichkeitsrechnung,” Ergebnisse der
Mathematik und ihrer Grenzgebiete, vol. 2 (Berlin, 1933), no. 3.

² A. Khintchine and A. Kolmogoroff, “Ueber Konvergenz von Reihen, deren Glieder
durch den Zufall bestimmt werden,” Recueil de la Société Mathématique de Moscou,
vol. 32 (1925), pp. 668-677; A. Kolmogoroff, “Ueber die Summen durch den Zufall
bestimmter zufälliger Grössen,” Mathematische Annalen, vol. 99 (1928), pp. 309-319;
vol. 102 (1930), pp. 484-488.

³ Cf. P. Lévy, “Sur quelques points de la théorie des probabilités dénombrables,”
Annales de l'Institut Henri Poincaré, vol. 6 (1936), pp. 153-184, where further references
are given.

⁴ B. Jessen and A. Wintner, “Distribution functions and the Riemann zeta function,”
Transactions of the American Mathematical Society, vol. 38 (1935), pp. 48-88,
more particularly 84-86.

⁵ P. Lévy, “Sur les séries dont les termes sont des variables éventuelles indépendantes,”
Studia Mathematica, vol. 3 (1931), pp. 119-155, more particularly chap.
I; cf. also the corrections on p. 337 of vol. 3 (1934) of the Annali della R. Scuola
Normale Superiore di Pisa.



actually occur in case of a divergent infinite convolution. Theorem 9 describes
what can happen to a divergent or not absolutely convergent infinite convolution
upon a reordering of its “factors.” Due to the correspondence alluded
to above,⁴ a part of Theorem 3 can be interpreted as a manifestation of the
famous “0 or 1” principle.⁶ The total content of Theorem 3, which concerns
infinite convolutions, cannot conveniently be formulated in terms of the 
probable convergence or divergence of sums of independent random variables, 
or in terms of a Lebesgue measure of an infinite product space. 

All distribution problems under consideration will be assumed to be one- 
dimensional, so that the random variables are real numbers. 


Metric. In what follows, Greek letters $\phi, \rho, \cdots$ will denote monotone
non-decreasing functions of a real variable $x$ which remain bounded as
$x \to \pm\infty$. The case of a constant function, i.e., the case where $\phi(x) = \alpha$


for every $x$ and for some real number $\alpha$, is not excluded.


(I) For a given function $y = \phi(x)$, the symbol $]\phi[$ will denote the
bounded open interval $\phi(-\infty) < y < \phi(+\infty)$ or the point $y = \phi(-\infty)$
according as $\phi(-\infty) < \phi(+\infty)$ or $\phi(-\infty) = \phi(+\infty)$, i.e., according
as $\phi(x)$ is not or is a constant function.

It will be convenient to consider a function $y = \phi(x)$ as a Jordan curve
in an $(x, y)$-plane. This is made possible by adjoining the point
$(x, y) = (x, \phi(x))$ to the segment constituted by the set of points $(x, y)$,
where $y$ describes the closed interval $\phi(x-0) \leq y \leq \phi(x+0)$, if $x$ is a
discontinuity point of $\phi$. Thus two functions, $\phi$ and $\psi$, determine the same
Jordan curve if and only if the two functions are equal at their continuity
points, i.e., if $\phi(x+0) = \psi(x+0)$ and/or $\phi(x-0) = \psi(x-0)$ for every
$x$. In this case the functions $\phi$ and $\psi$ will be considered as identical. Correspondingly,
a sequence of functions $\phi_n(x)$ is said to be convergent if there
exists a function $\phi(x)$ such that $\phi_n(x) \to \phi(x)$ holds at every continuity point
$x$ of $\phi(x)$. The signs $=$ and $\to$ will only be used in the sense just defined.
By a classical theorem of Helly, $\phi_n \to \phi$ whenever $\phi_n(x) \to \phi(x)$ holds for a
dense set of values $x$.

For a given number $\epsilon > 0$, let the $\epsilon$-strip about the Jordan curve $y = \phi(x)$
be defined as the set of those points of the $(x, y)$-plane whose distance from
at least one point of the Jordan curve is less than $\epsilon$.


(II) For two functions $y = \phi_1(x)$, $y = \phi_2(x)$, let $|\phi_1; \phi_2|$ or
$|\phi_1(x); \phi_2(x)|$ denote the greatest lower bound of those $\epsilon > 0$ for which every

⁶ A. Kolmogoroff, loc. cit. ¹, pp. 60-61.




point of the Jordan curve $y = \phi_1(x)$ is contained in the $\epsilon$-strip about the
Jordan curve $y = \phi_2(x)$.
Thus $|\phi_1; \phi_2|$ is a non-negative number not less than


$\mathrm{Max}\,(|\phi_1(+\infty) - \phi_2(+\infty)|,\; |\phi_1(-\infty) - \phi_2(-\infty)|)$
and not greater than 
Max (¢:(+ —¢:(— ©), $2(+ ©) —¢2(— 


It is easily verified that


(1₁) $|\phi_1; \phi_2| = 0$ if and only if $\phi_1 = \phi_2$;
(1₂) $|\phi_1; \phi_2| = |\phi_2; \phi_1|$;
(1₃) $|\phi_1; \phi_2| \leq |\phi_1; \phi_3| + |\phi_3; \phi_2|$.


This means that (II) defines a metrization of the space of all functions $\phi(x)$,
if the sign $\phi_1 = \phi_2$ is defined as above.⁷ This metrization is easily seen to be
a complete metrization, in the sense that


(2) $\lim_{m, n \to \infty} |\phi_n; \phi_m| = 0$ if and only if $\lim_{n \to \infty} |\phi_n; \phi| = 0$ for a $\phi$.

On the other hand, it is not true that convergence with reference to the metrization
defined by (II) is equivalent to convergence represented above by the
symbol $\phi_n \to \phi$. This is shown by the example $\phi_n(x) = \mathrm{sgn}\,(x + n)$, where
$\phi_n \to \phi$ does, but $|\phi_n; \phi| \to 0$ does not, hold for $\phi(x) = 1$. The general
situation is easily seen to be this:

(III) The three conditions
(3₁) $\phi_n \to \phi$; (3₂) $\phi_n(+\infty) \to \phi(+\infty)$; (3₃) $\phi_n(-\infty) \to \phi(-\infty)$
are necessary and sufficient for $|\phi_n; \phi| \to 0$.
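
A small numerical sketch may help to see why (3₃) fails in the example preceding (III). It assumes the example is $\phi_n(x) = \mathrm{sgn}\,(x + n)$ with limit $\phi(x) = 1$ (the shift direction is inferred from the stated limit), and it approximates the value at $-\infty$ by evaluating far to the left; both helper names are hypothetical.

```python
def sgn(x):
    return (x > 0) - (x < 0)

def phi_n(n, x):
    """The assumed example sequence: a unit step from -1 to +1 located at x = -n."""
    return sgn(x + n)

# At a fixed point the values eventually equal the limit function phi = 1 ...
pointwise = [phi_n(n, -7.5) for n in (1, 5, 10, 100)]
# ... but far to the left every phi_n still takes the value -1, so phi_n(-inf) = -1.
left_limits = [phi_n(n, -10.0 ** 9) for n in (1, 5, 10, 100)]
```

Since $|\phi_n; \phi|$ is not less than $|\phi_n(-\infty) - \phi(-\infty)| = 2$ for every $n$, the metric distance cannot tend to zero even though $\phi_n(x) \to 1$ at every fixed $x$.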

The function set $\|\phi_n\|$. Two functions $\phi_1(x)$, $\phi_2(x)$ will be said to be
congruent if there exists a number $c$ such that the two functions $\phi_1(x)$,
$\phi_2(x - c)$ are identical.


(IV) For a given sequence $\{\phi_n\}$ of functions $\phi_n(x)$, let $\|\phi_n\|$ denote
the set of those functions $\rho(x)$ for which one can choose a sequence $\{c_n\}$ of
numbers $c_n$ such that


(4) $\phi_n(x - c_n) \to \rho(x)$


holds at every continuity point $x$ of $\rho$.


*For another metrization, cf. P. Lévy, loc. cit. °, pp. 339-341. 




It is clear from this definition that


(5) $\|\phi_n\| \subset \|\psi_m\|$ whenever $\{\psi_m\}$ is a subsequence of $\{\phi_n\}$,
and that
(6) if $\rho(x) \subset \|\phi_n\|$, then $\rho(x - c) \subset \|\phi_n\|$ for every $c$.
(V) If $\phi_n^1$ and $\phi_n^2$ are congruent for $n = 1, 2, \cdots$, then the two function
sets $\|\phi_n^1\|$, $\|\phi_n^2\|$ are identical.
This is clear from the definitions.


LEMMA 1. If $\rho^1$ and $\rho^2$ are contained in a function set $\|\phi_n\|$, then either
the two functions $\rho^1$, $\rho^2$ are congruent or the two point sets $]\rho^1[$, $]\rho^2[$, defined
under (I), have no point in common.


Proof. Suppose, if possible, that $\rho^1$, $\rho^2$ are not congruent and $]\rho^1[$, $]\rho^2[$
do have a point in common. Then, on interchanging, if necessary, $\rho^1$ and $\rho^2$,
there clearly exists an $x = x_0$ for which


(7) $\rho^1(-\infty) < \rho^2(x_0) < \rho^1(+\infty)$.


Since $\rho^1 \subset \|\phi_n\|$ and $\rho^2 \subset \|\phi_n\|$, there exist two sequences of numbers, say
$\{c_n^1\}$ and $\{c_n^2\}$, such that


(8) $\phi_n(x - c_n^1) \to \rho^1(x)$, $\quad \phi_n(x - c_n^2) \to \rho^2(x)$.


Now the sequence of the differences $c_n^1 - c_n^2$ contains a subsequence which
tends either to a finite limit $c$ or to $-\infty$ or to $+\infty$. In the first case (8)
implies that $\rho^1(x) = \rho^2(x - c)$, which is a contradiction, since $\rho^1$ and $\rho^2$ are
not congruent, by hypothesis. In the second case (8) clearly implies that
$\rho^2(x) \leq \rho^1(-\infty)$. Since this contradicts (7), and since the third case can
be treated in the same way as the second case, the proof of Lemma 1 is complete.


LEMMA 2. If $\{\rho_m\}$ is a sequence of functions contained in a function
set $\|\phi_n\|$, then $\|\rho_m\| \subset \|\phi_n\|$.


Proof. If $\sigma \subset \|\rho_m\|$ and $\sigma$ is not a constant function, then Lemma 1
implies that $\sigma(x)$ and $\rho_m(x)$ are congruent for every sufficiently large $m$,
so that $\sigma \subset \|\phi_n\|$, by (6). Hence it is sufficient to prove that every constant
function $\alpha$ contained in $\|\rho_m\|$ is contained in $\|\phi_n\|$. In view of (V), one
can assume without loss of generality that $\rho_m(x) \to \alpha$ as $m \to \infty$. Then there
exists for every $\epsilon > 0$ and for every $t > 0$ an $M = M(\epsilon, t)$ such that


(9) $|\rho_m(\pm t) - \alpha| < \epsilon$ for every $m \geq M(\epsilon, t)$.




Furthermore, since $\rho_m \subset \|\phi_n\|$ for $m = 1, 2, \cdots$, there exist constants $c_n^m$
$(m, n = 1, 2, \cdots)$ such that

$\phi_n(x - c_n^m) \to \rho_m(x)$ as $n \to \infty$ $\quad (m = 1, 2, \cdots)$.
Hence if $t > 0$ is such that neither $x = t$ nor $x = -t$ is contained in the at


most enumerable set which consists of the points $x$ at which at least one $\rho_m$
is discontinuous, then, by the definition of the symbol $\to$, one can choose an
$N = N(\epsilon, m, t)$ such that
$|\phi_n(\pm t - c_n^m) - \rho_m(\pm t)| < \epsilon$ for every $n \geq N(\epsilon, m, t)$;
$(m = 1, 2, \cdots)$.
Consequently, from (9),


(10) $|\phi_n(\pm t - c_n^m) - \alpha| < 2\epsilon$, if $m \geq M(\epsilon, t)$, $n \geq N(\epsilon, m, t)$.


Since $\phi_n(x)$ is monotone and $\alpha$ is independent of $x$, it is clear that (10)
remains valid if one replaces $\pm t - c_n^m$ by $x - c_n^m$, where $x$ is any
number between $-t$ and $t$. Hence


(11) $|\phi_n(x - c_n^m) - \alpha| < 2t^{-1}$ if $|x| \leq t$, $m = M(t)$, $n \geq N(t)$,
where


M(t) = M(t7,t), N(t) M(t), ¢); (e==t-1). 
One can clearly assume that 


Since $t$ is any positive number not belonging to an at most enumerable set,
one can choose $t = t_k$, where $t_k \to \infty$ as $k \to \infty$. For a fixed $n$ which
is not less than $N(t_1)$, let the integer $k = k_n$ be defined by the condition


$N(t_{k_n}) \leq n < N(t_{k_n + 1})$.
Then
(12) $k_n \to \infty$ and $t_{k_n} \to \infty$ as $n \to \infty$.
Now, on placing
$d_n = c_n^m$, where $m = M(t_{k_n})$,


it is clear from (11) that
$|\phi_n(x - d_n) - \alpha| < 2 t_{k_n}^{-1}$, if $|x| \leq t_{k_n}$, $n \geq N(t_1)$.
Hence (12) implies that
$\phi_n(x - d_n) \to \alpha$, $n \to \infty$,


for every $x$. Thus $\alpha \subset \|\phi_n\|$. This completes the proof of Lemma 2.




(VI) If $\rho(x)$ is contained in $\|\phi_n\|$, then so are the constant functions
$\alpha = \rho(-\infty)$ and $\alpha = \rho(+\infty)$.
This is clear from Lemma 2, since if $\rho_m(x) = \rho(x \pm m)$, then
$\rho_m \subset \|\phi_n\|$, by (6).


Distribution functions. A monotone non-decreasing function $\phi(x)$ is
said to be a distribution function if the set $]\phi[$, defined in (I), is the interval
$0 < y < 1$, i.e., if $\phi(-\infty) = 0$ and $\phi(+\infty) = 1$. By the spectrum of a
distribution function $\phi(x)$ is meant the set of those points $x = x_0$ for which

The metric defined under (II), when applied to the space of all distribution
functions, defines a topology which is equivalent to the one defined by the
symbol $\phi_n \to \phi$. In fact, since (3₂) and (3₃) are satisfied if $\phi_n$ and $\phi$ are
distribution functions, (III) clearly implies


(VII) If $\phi_1, \phi_2, \cdots$ and $\phi$ are distribution functions, then $\phi_n \to \phi$ if
and only if $|\phi_n; \phi| \to 0$.

An obvious corollary of (VII) is the well-known fact that if $\phi_n$ and $\phi$
are distribution functions and $\phi$ is continuous for $-\infty < x < +\infty$, then
$\phi_n$ cannot tend to $\phi$ unless the convergence is uniform for $-\infty < x < +\infty$.
It is clear from (2) that (VII) implies


(VIII) If $\phi_1, \phi_2, \cdots$ are distribution functions, then there exists a distribution
function $\phi$ satisfying $\phi_n \to \phi$ if and only if $\lim_{m, n \to \infty} |\phi_n; \phi_m| = 0$.


Throughout the paper, use will be made of the following notation: 
(IX) If $\chi(x)$ is a monotone function for which


$0 \leq \chi(-\infty) \leq \chi(+\infty) \leq 1$,


let there be defined for every positive number $t$ a distribution function $[\chi]_t$
by placing
cr<—t; 
Thus $[\chi(x)]_t$ is a distribution function which is defined for every distribution
function $\chi(x)$ and for certain functions $\chi(x)$ which are not distribution
functions.




ON DIVERGENT INFINITE CONVOLUTIONS. 641 


(X) If $\{\psi_m\}$ is a sequence of distribution functions and $\rho$ a monotone non-decreasing function which need not be a distribution function, then $\psi_m \to \rho$ if and only if $|[\psi_m]_t; [\rho]_t| \to 0$ for every fixed $t > 0$.

This is clear from (VII) and (IX).


The set $\|\phi_n\|$ in case of distribution functions $\phi_n$. If $\phi_1, \phi_2, \dots$ are distribution functions, it is easily seen from (IV) that $\|\phi_n\|$ contains the constant functions $\alpha \equiv 0$ and $\alpha \equiv 1$. It is shown by the example

$\phi_{2n}(x) = \tfrac{1}{2}(1 + \operatorname{sgn} x)$, $\phi_{2n+1}(x) = \tfrac{1}{2}\left(1 + \tfrac{2}{\pi} \arctan x\right)$

that the set $\|\phi_n\|$ belonging to a sequence $\{\phi_n\}$ of distribution functions need not contain any function distinct from the constant functions $\alpha \equiv 0$ and $\alpha \equiv 1$. In particular, the set $\|\phi_n\|$ belonging to a sequence $\{\phi_n\}$ of distribution functions need not contain a distribution function.


THEOREM 1. Every sequence $\{\phi_n\}$ of distribution functions contains a subsequence $\{\psi_m\}$ which has the following properties:

(i) There exists an at most enumerable set of mutually disjoint open subintervals $\alpha_k < y < \beta_k$ of $0 \le y \le 1$ such that a constant function is contained in $\|\psi_m\|$ if and only if its value $y$ is not contained in any of these intervals $\alpha_k < y < \beta_k$.

(ii) There exists for each of these intervals $\alpha_k < y < \beta_k$ a monotone non-decreasing function $\rho_k(x)$ such that the set $]\rho_k[$, defined in (1), is the interval $\alpha_k < y < \beta_k$ and a non-constant function is contained in $\|\psi_m\|$ if and only if it is congruent with a $\rho_k(x)$.


The example mentioned before shows that Theorem 1 becomes false if one replaces a suitably chosen subsequence $\{\psi_m\}$ of $\{\phi_n\}$ by $\{\phi_n\}$ itself. The proof of Theorem 1 will be based on the following facts (XI), (XII), (XIII):


(XI) If $\{\phi_n\}$ is a sequence of distribution functions and $s$ a given number such that $0 \le s \le 1$, then $\{\phi_n\}$ contains a subsequence $\{\psi_m\}$ such that for some element $\rho^s = \rho^s(x)$ of the function set $\|\psi_m\|$ the number $s$ is contained in $]\rho^s[$.

Proof of (XI). The assumptions of (XI) clearly imply that

(13) $\phi_n(b_n - 0) \le s \le \phi_n(b_n + 0)$

for every $n$ and for some number $b = b_n$. Since $\{\phi_n(x + b_n)\}$ is a sequence of distribution functions, it contains a convergent subsequence. Let $\{\psi_m(x)\}$ and $\{a_m\}$ be the corresponding subsequences of $\{\phi_n(x)\}$ and $\{-b_n\}$ respectively and let $\sigma(x)$ denote the limit function of $\{\psi_m(x - a_m)\}$. Since $\sigma(x)$ is contained in $\|\psi_m\|$, so are, by (VI), the constant functions $\sigma(\pm\infty)$. Hence, on placing $\rho^s(x) = \sigma(x)$ or $\rho^s(x) = \sigma(\pm\infty)$ according as the number $s$ is contained in $]\sigma[$ or is equal to $\sigma(\pm\infty)$, the statement of (XI) follows.


(XII) Every sequence $\{\phi_n\}$ of distribution functions contains a subsequence $\{\psi_m\}$ such that there exists for every rational number $r$, where $0 \le r \le 1$, an element $\rho^r$ of $\|\psi_m\|$ for which $r$ is contained in $]\rho^r[$.

The proof of (XII) follows from (XI) by a straightforward application of the diagonal principle.

(XIII) If a sequence $\{\psi_m\}$ of distribution functions is such that $\|\psi_m\|$ contains for every rational number $r$, where $0 \le r \le 1$, an element $\rho^r$ for which $]\rho^r[$ contains $r$, then $\|\psi_m\|$ contains for every real number $y$, where $0 \le y \le 1$, an element $\rho^y$ for which $]\rho^y[$ contains $y$.

Proof of (XIII). In the proof that $\rho^y$ exists for a given $y$, it may be assumed without loss of generality that $y$ is neither contained in a $]\rho^r[$ nor is $y$ a boundary point of a $]\rho^r[$; cf. (VI). It is clear that, under these assumptions,

$\rho^{r_n}(x) \to y$,

whenever the sequence $\{r_n\}$ of rational numbers is such that $r_n \to y$. It follows, therefore, from Lemma 2 that the constant function $\alpha \equiv y$ is contained in $\|\psi_m\|$. Thus the requirement of (XIII) is satisfied by the constant function $\rho^y \equiv y$.


Proof of Theorem 1. Let $\{\psi_m\}$ be a subsequence of $\{\phi_n\}$ such that $\|\psi_m\|$ has the property stated under (XII). Then there exists, by (XIII), for every $y$, where $0 \le y \le 1$, a $\rho^y \subset \|\psi_m\|$ such that $y$ is contained in $]\rho^y[$. If $u$ and $v$ are two distinct $y$-values and neither $\rho^u$ nor $\rho^v$ is a constant function, then Lemma 1 implies that the two open intervals $]\rho^u[$, $]\rho^v[$ are either disjoint or coincident, and that in the latter case $\rho^u$ and $\rho^v$ are congruent. Consequently, there exists in the interval $0 \le y \le 1$ an at most enumerable set of mutually disjoint open intervals $\alpha_k < y < \beta_k$ such that an open $y$-interval is an interval $\alpha_k < y < \beta_k$ if and only if it is an interval $]\rho^y[$ belonging to some non-constant function $\rho^y$, where $0 \le y \le 1$. Hence it is clear from (XIII) that if a number $y$, where $0 \le y \le 1$, is in none of the intervals $\alpha_k < y < \beta_k$, then the constant function $\alpha \equiv y$ is contained in $\|\psi_m\|$. On combining this with Lemma 2, Theorem 1 follows.

It is understood that the open set formed by the intervals $\alpha_k < y < \beta_k$ of Theorem 1 can be the empty set.


(XIV) If a subsequence $\{\psi_m\}$ of a sequence $\{\phi_n\}$ of distribution functions satisfies the requirements (i), (ii) of Theorem 1, then $\|\psi_m\|$ either contains a distribution function or it contains a constant function $\alpha$, where $0 < \alpha < 1$.

This is clear from Theorem 1.


(XV) If a sequence $\{\phi_n\}$ of distribution functions is such that not every constant function $\alpha$ $(0 \le \alpha \le 1)$ is contained in $\|\phi_n\|$, then some subsequence $\{\psi_m\}$ of $\{\phi_n\}$ is such that $\|\psi_m\|$ contains a non-constant function.

Proof. Suppose that there exists a constant function $\alpha$ $(0 \le \alpha \le 1)$ which is not contained in $\|\phi_n\|$. Then $\alpha \ne 0$ and $\alpha \ne 1$. Hence there exists a sequence $\{c_n\}$ of numbers such that

(14) $\phi_n(c_n - 0) \le \alpha \le \phi_n(c_n + 0)$.

Since the sequence of the distribution functions $\phi_n(x + c_n)$ cannot tend to the constant function $\alpha$, it contains a subsequence which tends to a limit function $\rho(x) \not\equiv \alpha$. Hence it is clear from (14) that $\rho(x)$ is not a constant function. Finally, if $\{\psi_m(x)\}$ is that subsequence of $\{\phi_n(x)\}$ which corresponds to the subsequence of $\{\phi_n(x + c_n)\}$ defining $\rho(x)$, then $\rho \subset \|\psi_m\|$. This completes the proof of (XV).


THEOREM 2. If a subsequence $\{\psi_k\}$ of a sequence $\{\phi_n\}$ of distribution functions satisfies the requirements of Theorem 1, then one can choose for every $\epsilon > 0$ and for every $t > 0$ an $M = M(\epsilon, t) > 0$ such that

(i) there exists for every $m > M(\epsilon, t)$ and for every function $\rho \subset \|\psi_k\|$ a number $c = c_m$ for which

(15) $|[\psi_m(x - c)]_t; [\rho(x)]_t| < \epsilon$;

(ii) there exists for every $m > M(\epsilon, t)$ and every number $c$ a $\rho$ such that (15) is satisfied for this $\rho = \rho_m^c$.

It is understood that the symbols $|\ ;\ |$ and $[\ ]_t$ are those defined under (II) and (IX).
Proof. For a fixed e > 0, choose K = K, values y; such that 


(16) ye —1 and — < Fe. 




By Theorem 1, there exists for every 7 an element x = yj of || yx || such that 
the y-set ]x;[ contains the point y= y;. Choose a number a; such that 


xi(4j —9) = xi + 0) 
and put 
= +45). 
Then 
(17) 73(— 0) S7;(+ 9), 


and 7;(x) is, by (6), contained in || yx ||. It follows, therefore, from (X) 
that one can choose for every j and for every t > 0 an N = Nj(e,t) such that 
(18) | [Wm(a@— ej") Jor | < de for every m > Nj(e, t), 

if the number c;” is suitably chosen. Notice that cj” can be chosen as in- 
dependent of «. On placing 


(19) M(e,t) = Max t)’), where K = K,, 


the statement (i) of Theorem 2 may be proved as follows: 

Let p be a given element of || yx ||. If p(x) is of the form 7;(z— )), 
where j = 1,: - -, K, and b is a number between — ¢ and ¢, then (i) is clear 
from (18) and (19). Suppose, therefore, that p(x) is for no 7 of the form 
7;(a— b), where |6| <¢. Then, on the one hand, there exists by Lemma 1 
a 7 such that 

and, on the other hand, for this 7, 


as seen from Lemma 1 and from (17). It follows, therefore, from (16) that, 
for this j, 
+ < de; 


cf. (II) and (IX). Since, by (18) and (19), 

[[Ym(a + t— ej”) + t)]t| < de for m > 1), 
it is seen from (1;) that 

+ [p(a) ]e] <4e+ Fe for m > M(e,t). 


Hence, (i) in Theorem 2 is satisfied by c = cj" —t. 
The converse statement of Theorem 2, namely (ii), is similarly proved. 


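Convolutions. If $\phi_1$, $\phi_2$ are two distribution functions, there exists a unique distribution function $\phi_1 * \phi_2$ which is defined by

(20) $\phi_1 * \phi_2(x) = \int_{-\infty}^{+\infty} \phi_1(x - u)\, d\phi_2(u)$

and is called the convolution of $\phi_1(x)$ and $\phi_2(x)$. It is known that

$\phi_1 * \phi_2 = \phi_2 * \phi_1$ and $(\phi_1 * \phi_2) * \phi_3 = \phi_1 * (\phi_2 * \phi_3)$.

It is seen from (20) that, for arbitrary numbers $c_1$, $c_2$,

(21) $\phi_1(x - c_1) * \phi_2(x - c_2) = \phi(x - c_1 - c_2)$, where $\phi(x) = \phi_1(x) * \phi_2(x)$.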
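The rules just stated can be checked by hand on purely discontinuous distribution functions. The following Python sketch is an illustration only: it assumes each law is stored as a dictionary of atoms and masses, takes the distribution function to be right-continuous at its jumps (a convention chosen here, not taken from the paper), and uses hypothetical helper names `convolve`, `shift` and `cdf`.

```python
from fractions import Fraction
from itertools import product


def convolve(p, q):
    """Convolution of two discrete laws stored as {atom: mass} dictionaries."""
    out = {}
    for (a, pa), (b, pb) in product(p.items(), q.items()):
        out[a + b] = out.get(a + b, 0) + pa * pb
    return out


def shift(p, c):
    """Law whose distribution function is phi(x - c): each atom moves by c."""
    return {a + c: m for a, m in p.items()}


def cdf(p, x):
    """Distribution function phi(x): total mass carried by atoms <= x."""
    return sum(m for a, m in p.items() if a <= x)


# Three illustrative laws (arbitrary choices, exact rational masses).
phi1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
phi2 = {0: Fraction(1, 3), 2: Fraction(2, 3)}
phi3 = {-1: Fraction(1, 4), 5: Fraction(3, 4)}

phi = convolve(phi1, phi2)

# (20) for discrete laws: phi(x) = sum over atoms u of phi2 of phi1(x - u) * mass(u).
for x in range(-2, 6):
    assert cdf(phi, x) == sum(cdf(phi1, x - u) * m for u, m in phi2.items())

# Commutativity and associativity of the convolution.
assert convolve(phi1, phi2) == convolve(phi2, phi1)
assert convolve(convolve(phi1, phi2), phi3) == convolve(phi1, convolve(phi2, phi3))

# (21): translating the factors by c1, c2 translates the convolution by c1 + c2.
c1, c2 = 3, -7
assert convolve(shift(phi1, c1), shift(phi2, c2)) == shift(phi, c1 + c2)
```

Exact rational arithmetic is used so that the identities can be tested with `==` rather than with a floating-point tolerance.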
Use will be made also of the following facts:⁸


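(XVI) If $\phi_n^1(x)$, $\phi_n^2(x)$, $\phi^1(x)$, $\phi^2(x)$ are distribution functions such that $\phi_n^1 \to \phi^1$ and $\phi_n^2 \to \phi^2$, then $\phi_n^1 * \phi_n^2 \to \phi^1 * \phi^2$.

(XVII) If $\phi_n^1(x)$, $\phi_n^2(x)$, $\phi(x)$ are distribution functions such that $\phi_n^1 \to \phi$ and $\phi_n^1 * \phi_n^2 \to \phi$, then $\phi_n^2 \to \omega$, where $\omega(x) = \tfrac{1}{2}(1 + \operatorname{sgn} x)$.

A fact similar to (XVII) is

(XVIII) If $\phi_n^1(x)$, $\phi_n^2(x)$ are distribution functions such that $\phi_n^1 * \phi_n^2 \to \omega$, where $\omega(x) = \tfrac{1}{2}(1 + \operatorname{sgn} x)$, then there exists a sequence $\{c_n\}$ of numbers such that $\phi_n^1(x + c_n) \to \omega(x)$ and $\phi_n^2(x - c_n) \to \omega(x)$.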
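Proof of (XVIII). On placing $\omega_n = \phi_n^1 * \phi_n^2$, the assumption of (XVIII) is that, for every $\epsilon > 0$ and for some $N = N(\epsilon)$,

$1 - \epsilon < \omega_n(\epsilon) - \omega_n(-\epsilon)$, if $n \ge N(\epsilon)$.

Since $\omega_n = \phi_n^1 * \phi_n^2$,

$\omega_n(\epsilon) - \omega_n(-\epsilon) = \int_{-\infty}^{+\infty} [\phi_n^1(\epsilon - u) - \phi_n^1(-\epsilon - u)]\, d\phi_n^2(u)$;

and so one can assume that there exists a $u$ for which

$\omega_n(\epsilon) - \omega_n(-\epsilon) \le \phi_n^1(\epsilon - u) - \phi_n^1(-\epsilon - u)$.

Consequently, if $c_n^\epsilon$ denotes this $u$,

$1 - \epsilon < \phi_n^1(\epsilon - c_n^\epsilon) - \phi_n^1(-\epsilon - c_n^\epsilon)$, if $n \ge N(\epsilon)$.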


®B. Jessen and A. Wintner, loc. cit. +, Section 3. 




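One can assume that $N(1/k) < N(1/(k + 1))$, where $k = 1, 2, \dots$. For a given $n > N(1)$, define an integer $k = k_n$ by the requirement that

$N(1/k_n) < n \le N(1/(k_n + 1))$.

Then $k_n \to \infty$ as $n \to \infty$ and, for every $n$,

$1 - k_n^{-1} < \phi_n^1(c_n + k_n^{-1}) - \phi_n^1(c_n - k_n^{-1})$,

where $c_n$ denotes the negative value of $c_n^\epsilon$ for $\epsilon = k_n^{-1}$. Since $k_n^{-1} \to 0$ as $n \to \infty$, it follows that

$\phi_n^1(x + c_n) \to \tfrac{1}{2}(1 + \operatorname{sgn} x)$.

This, when combined with (21), clearly completes the proof of (XVIII).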


A lemma on convolutions. The object of this section is the proof of a 
somewhat involved fact which isolates an essential part in the proof of Theorem 
3 of the next section. The lemma in question (Lemma 3) is to the effect that 
if a convolution process ¢; *:, when applied to a do, flattens 2 strongly in 
the large, then also the local flattening of #2 must be quite strong. Certain 
weaker results to the same effect have been indicated by Lévy.° 


Lemma 3. Let $\lambda(x)$, $\mu(x)$ be two distribution functions such that there exist four positive constants $p$, $q$, $t$, $s$ which have the following properties:

(i) $6q < p(1 - p)$ and $p < 1$, hence $q < 1$;
(ii) $\lambda(x + 2t) - \lambda(x) \le p + q$ for every $x$;
(iii) $\lambda(s) - \lambda(-s) \ge 1 - q$;
(iv) $\mu(t) - \mu(-t) \ge p - 2q$;
(v) $\mu(t + 2s) - \mu(-t - 2s) \le p + q$.

Then there cannot exist a distribution function $\nu(x)$ such that $\lambda(x) * \nu(x) = \mu(x)$.


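Proof. Suppose, if possible, that there exists a distribution function $\nu(x)$ such that

$\mu(x) = \int_{-\infty}^{+\infty} \lambda(x - u)\, d\nu(u)$.

Then, from (iv),

$p - 2q \le \mu(t) - \mu(-t) = \left\{ \int_{-\infty}^{-s-t} + \int_{-s-t}^{s+t} + \int_{s+t}^{+\infty} \right\} [\lambda(t - u) - \lambda(-t - u)]\, d\nu(u)$,

where, according to (ii),

$\int_{-s-t}^{s+t} [\ ]\, d\nu(u) \le (p + q)[\nu(s + t) - \nu(-s - t)]$,

while, according to (iii),

$\left\{ \int_{-\infty}^{-s-t} + \int_{s+t}^{+\infty} \right\} [\ ]\, d\nu(u) \le q \int_{-\infty}^{+\infty} d\nu(u) = q$,

so that

$p - 2q \le q + (p + q)[\nu(s + t) - \nu(-s - t)]$.

On the other hand, from (v) and from the assumption $\mu = \lambda * \nu$,

$p + q \ge \mu(t + 2s) - \mu(-t - 2s) = \int_{-\infty}^{+\infty} [\lambda(2s + t - u) - \lambda(-2s - t - u)]\, d\nu(u)$,

where $\int_{-\infty}^{+\infty} \ge \int_{-s-t}^{s+t}$, and so, $\lambda(x)$ being a non-decreasing function which satisfies (iii),

$p + q \ge (1 - q)[\nu(s + t) - \nu(-s - t)]$.

Consequently, by the inequality found before for $p - 2q$,

$p - 2q \le q + (p + q)^2/(1 - q)$,

and so

$p - p^2 \le 3q(1 + p)$.

Since this contradicts (i), the proof of Lemma 3 is complete.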
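The closing algebra of this proof is compressed, so it may help to write it out. The lines below are a sketch that reads the two estimates of the proof as $p - 2q \le q + (p+q)V$ and $p + q \ge (1-q)V$, where the shorthand $V = \nu(s+t) - \nu(-s-t)$ is introduced here and is not the authors' notation.

```latex
\begin{align*}
V &\le \frac{p+q}{1-q} && \text{(second estimate, using } q < 1\text{)}\\
p - 2q &\le q + \frac{(p+q)^2}{1-q} && \text{(insert into the first estimate)}\\
(p - 3q)(1 - q) &\le (p+q)^2 && \text{(multiply by } 1 - q > 0\text{)}\\
p - pq - 3q + 3q^2 &\le p^2 + 2pq + q^2\\
p - p^2 &\le 3q + 3pq - 2q^2 \;\le\; 3q(1+p).
\end{align*}
% Contradiction with (i): since p < 1,
%   3q(1+p) < 6q < p(1-p) = p - p^2 .
```

Thus condition (i) is used twice: $p < 1$ gives $3q(1+p) < 6q$, and $6q < p(1-p)$ then yields $p - p^2 < p - p^2$, which is impossible.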


Convolution sequences. A sequence $\{\phi_n\}$ of distribution functions will be called a convolution sequence if there exists a sequence $\{\sigma_n\}$ of distribution functions such that

(22) $\phi_n = \sigma_1 * \cdots * \sigma_n$, i.e., $\phi_n = \phi_{n-1} * \sigma_n$, $\phi_1 = \sigma_1$, $n = 2, 3, \dots$.

(XIX) Every subsequence of a convolution sequence is a convolution sequence.

This is clear from $\chi_1 * (\chi_2 * \chi_3) = (\chi_1 * \chi_2) * \chi_3$.


(XX) If $\{\phi_n(x)\}$ is a convolution sequence, then so is $\{\phi_n(x - c_n)\}$ for every sequence of numbers $c_n$.

This is clear from (21).


THEOREM 3. If, for a given convolution sequence $\{\phi_n\}$, one denotes by $\|\phi_n\|_0$ the function set obtained from $\|\phi_n\|$ by omitting the two constant functions $\alpha \equiv 0$, $\alpha \equiv 1$, then either every element of $\|\phi_n\|$ is a constant function or every element of $\|\phi_n\|_0$ is a distribution function. In the first case every constant function $\alpha$, where $0 \le \alpha \le 1$, is an element of $\|\phi_n\|$. In the second case there exists a distribution function $\rho$ such that a function is an element of $\|\phi_n\|_0$ if and only if it is congruent with $\rho$. In the first case the convolution sequence $\{\phi_n\}$ will be said to be flat, in the second case non-flat.

The proof is somewhat lengthy and will be decomposed into the following steps $(\mathrm{XXI}_1)$, $(\mathrm{XXI}_2)$:

$(\mathrm{XXI}_1)$ If $\{\phi_n\}$ is a convolution sequence, then every non-constant function contained in the function set $\|\phi_n\|$ is a distribution function.


Proof of $(\mathrm{XXI}_1)$. Let $\{\psi_m\}$ be a subsequence of $\{\phi_n\}$ such that $\{\psi_m\}$ has the properties described in Theorem 1. Thus there exists a number $p \ge 0$ such that every $\rho \subset \|\psi_m\|$ has, for $-\infty < x < +\infty$, a total variation not greater than $p$, while the total variation of some $\rho_0 \subset \|\psi_m\|$ is equal to $p$. It is seen from (5) that $(\mathrm{XXI}_1)$ will be proved if one shows that every non-constant function contained in $\|\psi_m\|$ is a distribution function. Suppose, if possible, that there exists in $\|\psi_m\|$ a non-constant function which is not a distribution function. Then, by the definition of $\rho_0$,

(23) $0 < p < 1$, where $p = \rho_0(+\infty) - \rho_0(-\infty)$.

Hence one can choose a $q > 0$ for which condition (i) of Lemma 3 is satisfied. It is also seen from (23) that, for some $t > 0$,

(24) $\rho_0(t) - \rho_0(-t) > p - q$.

On the other hand, by the definition of $p$,

$p \ge \rho(+\infty) - \rho(-\infty) \ge \rho(x + 2t) - \rho(x)$

for every $\rho \subset \|\psi_m\|$ and for every $x$. Hence (i) of Theorem 2 assures the existence of a $\psi_k \subset \{\psi_m\}$ such that

$\psi_k(x + 2t) - \psi_k(x) \le p + q$ for every $x$.

On denoting this $\psi_k$ by $\lambda$, condition (ii) of Lemma 3 is satisfied. Since $\lambda$ is a distribution function and $q > 0$, there clearly exists a number $s > 0$ which satisfies condition (iii) of Lemma 3. Since $\rho_0 \subset \|\psi_m\|$, there exists a sequence $\{c_m\}$ of numbers such that, at every continuity point $x$ of $\rho_0$,

$\psi_m(x - c_m) \to \rho_0(x)$, $m \to \infty$.

Hence it is seen from (24) and (23) that there exists ⁹ a $\psi_l \subset \{\psi_m\}$ such that

$\psi_l(t - c_l) - \psi_l(-t - c_l) > p - q - q$

and

$\psi_l(t + 2s - c_l) - \psi_l(-t - 2s - c_l) < p + q$.

On denoting the distribution function $\psi_l(x - c_l)$ by $\mu(x)$, conditions (iv) and (v) of Lemma 3 are satisfied. Since $\{\psi_m\}$ is, by (XIX), a convolution sequence, and since

$\lambda(x) = \psi_k(x)$, $\mu(x) = \psi_l(x - c_l)$ and $l > k$,

there exists, by (20) and (21), a distribution function $\nu$ such that $\lambda * \nu = \mu$. Since this contradicts Lemma 3, the proof of $(\mathrm{XXI}_1)$ is complete.


$(\mathrm{XXI}_2)$ If a distribution function $\rho$ is contained in $\|\psi_m\|$, where $\{\psi_m\}$ is a subsequence of a convolution sequence $\{\phi_n\}$, then $\rho$ is contained in $\|\phi_n\|$.


Proof of $(\mathrm{XXI}_2)$. Suppose, if possible, that $(\mathrm{XXI}_2)$ is false. Then $(\mathrm{XXI}_1)$ implies that the set $\|\chi_k\|_0$ belonging to a suitably chosen subsequence $\{\chi_k\}$ of $\{\phi_n\}$ contains a function $\tau$ which is not congruent with $\rho$. In particular,

(25) $\tau_k \to \tau$, $k \to \infty$,

holds for a sequence of distribution functions $\tau_k(x)$ of the form $\chi_k(x - c_k)$. Since $\psi_m(x)$ and $\tau_k(x)$ are of the form $\phi_n(x - a_n)$, and since $\{\phi_n\}$ is a convolution sequence, there exist, by (21), for every $m$ two distribution functions $\lambda_m$, $\mu_m$ and two positive integers $k$, $j$ such that

(26) $\psi_m * \lambda_m = \tau_k$, where $k \to \infty$ as $m \to \infty$,

and

(27) $\tau_k * \mu_m = \psi_j$, where $j > m$.

Hence

(28) $\psi_m * \lambda_m * \mu_m = \psi_j$ for some $j > m$.


⁹ The discontinuity points of $\rho_0$, if any, do not interfere with the possibility of this conclusion, since they form an at most enumerable set.




Since $\rho \subset \|\psi_m\|$, one can assume without loss of generality [cf. (IV) and (21)] that $\psi_m \to \rho$ as $m \to \infty$. Then (28) implies, in view of (XVII), that $\lambda_m * \mu_m \to \omega$. It follows, therefore, from (XVIII) that, for a suitably chosen sequence of numbers $b_m$,

(29) $\lambda_m(x - b_m) \to \omega(x)$ as $m \to \infty$.

Since, from (26) and (25),

(30) $\psi_m(x) * \lambda_m(x) \to \tau(x)$,

and since $\tau(x)$ is not one of the constant functions equal to 0 or 1, it is easily inferred from (XVI) and (31) that the numbers $b_m$ tend to a limit $b$ as $m \to \infty$. Hence (29) can be replaced by

(31) $\lambda_m(x) \to \omega(x + b)$.

It follows, therefore, from (30), (XVI) and from the definition $\psi_m(x) \to \rho(x)$ of $\rho$ that

(32) $\rho(x) * \omega(x + b) = \tau(x)$.

This means in view of the definition of $\omega$ that $\rho(x + b) = \tau(x)$, i.e., that $\tau$ is a distribution function and that $\rho$ and $\tau$ are congruent. Since this contradicts the assumption, the proof of $(\mathrm{XXI}_2)$ is complete.


Proof of Theorem 3. Suppose that $\|\phi_n\|$ does not contain all constant functions $\alpha$, where $0 \le \alpha \le 1$. Then $\{\phi_n\}$ contains, by (XV), a subsequence $\{\psi_m\}$ such that $\|\psi_m\|$ contains a non-constant function, say $\rho$. This $\rho$ is, by $(\mathrm{XXI}_1)$, a distribution function. Furthermore, $\rho \subset \|\phi_n\|$, by $(\mathrm{XXI}_2)$. Finally, it is seen from (6) and Lemma 1 that a function is contained in $\|\phi_n\|_0$ if and only if it is congruent with $\rho$. This completes the proof of Theorem 3.

There arises the question how to decide whether or not a given convolution sequence $\{\phi_n\}$ is flat in the sense of Theorem 3. In order to obtain criteria to this effect, put

$(33_1)$ $E(\chi) = \int_{-\infty}^{+\infty} x\, d\chi(x)$,

if the distribution function $\chi$ has a first moment, which is certainly the case if $\chi$ has a finite second moment

$(33_2)$ $F(\chi) = \int_{-\infty}^{+\infty} x^2\, d\chi(x)$.

In the latter case, let

(33) $D(\chi) = F(\chi) - [E(\chi)]^2$.

Thus $D(\chi) = \int_{-\infty}^{+\infty} [x - E(\chi)]^2\, d\chi(x)$; hence

(34) $D(\chi) \ge 0$,

where $D(\chi) = 0$ if and only if $\chi$ is congruent with $\omega = \tfrac{1}{2}(1 + \operatorname{sgn} x)$. If $F(\chi) = +\infty$, put $D(\chi) = +\infty$.
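For a purely discontinuous distribution function the integrals defining $E$, $F$ and $D$ reduce to finite sums, so the definitions and their behavior under a translation $\chi(x) \to \chi(x - c)$ can be checked directly. The Python sketch below is illustrative only; the dictionary representation of a law and the helper names `D_central` and `shift` are assumptions made here.

```python
from fractions import Fraction


def E(law):
    """First moment of a discrete law {atom: mass}: sum of x * mass."""
    return sum(x * m for x, m in law.items())


def F(law):
    """Second moment: sum of x^2 * mass."""
    return sum(x * x * m for x, m in law.items())


def D(law):
    """D = F - E^2."""
    return F(law) - E(law) ** 2


def D_central(law):
    """Sum of (x - E)^2 * mass, the centered expression for D."""
    e = E(law)
    return sum((x - e) ** 2 * m for x, m in law.items())


def shift(law, c):
    """Law with distribution function chi(x - c): each atom moves by c."""
    return {x + c: m for x, m in law.items()}


# An arbitrary three-point law with exact rational masses.
chi = {-1: Fraction(1, 4), 0: Fraction(1, 4), 2: Fraction(1, 2)}

# Both expressions for D agree, and D is non-negative.
assert D(chi) == D_central(chi) >= 0

# D vanishes for a single point mass (a translate of the unit step).
point_mass = {3: Fraction(1)}
assert D(point_mass) == 0

# Effect of a translation by c on E, F and D.
c = Fraction(5, 2)
chi_c = shift(chi, c)
assert E(chi_c) == E(chi) + c
assert F(chi_c) == F(chi) + 2 * c * E(chi) + c ** 2
assert D(chi_c) == D(chi)
```

The last three assertions show the mean moving with the translation while $D$ stays fixed, which is why $D$ depends only on the congruence class of the distribution function.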
It is clear from $(33_1)$ that

$(35_1)$ $E(\chi_1) = E(\chi_2) + c$, if $\chi_1(x) = \chi_2(x - c)$.

Similarly, from $(33_2)$,

$(35_2)$ $F(\chi_1) = F(\chi_2) + 2cE(\chi_2) + c^2$, if $\chi_1(x) = \chi_2(x - c)$.

Consequently, from (33),

(35) $D(\chi_1) = D(\chi_2)$, if $\chi_1$ and $\chi_2$ are congruent.

Hence, on combining (IV), (21) and Theorem 3 with known ¹⁰ criteria for the convergence of infinite convolutions, it is seen that Theorem 3 can be completed by

THEOREM 4. In order that a convolution sequence $\{\phi_n\}$ be non-flat, it is sufficient that, on using the notations (33) and (22),

(36) $\sum_{m=1}^{\infty} D(\sigma_m) < +\infty$ [cf. (34)].

This sufficient condition is necessary as well in case there exists a sufficiently large $L > 0$ such that the spectrum of every $\sigma_n$ is a subset of an $x$-interval of suitably chosen position and of length $L$, where $L$ is independent of $n$.

Convergent infinite convolutions. Let $\{\phi_n\}$ be a convolution sequence, i.e., a sequence of distribution functions which can be represented in the form (22). It is clear that $\{\phi_n\}$ can converge, in the sense of $(3_1)$, to a $\phi$ which is not a distribution function. The infinite convolution $\sigma_1 * \sigma_2 * \cdots$ is said to be a convergent infinite convolution only if (22) satisfies $(3_1)$ with a function $\phi$ which is a distribution function. On using this terminology, Theorem 3 clearly implies

¹⁰ B. Jessen and A. Wintner, loc. cit. +, Theorem 4 and Theorem 5.


THEOREM 5. If a convolution sequence

(37) $\{\phi_n\} = \{\sigma_1 * \cdots * \sigma_n\}$

tends, in the sense of $(3_1)$, to a limit function $\phi$, and if the infinite convolution $\sigma_1 * \sigma_2 * \cdots$ is not convergent, then $\phi$ is a constant function.

Theorem 3 also implies ¹¹


THEOREM 6. There exists for every non-flat convolution sequence (37) a sequence $\{c_n\}$ of numbers for which the infinite convolution

(38) $\sigma_1(x - c_1) * \sigma_2(x - c_2) * \cdots$

is convergent. If $\{c_n\}$ and $\{\bar{c}_n\}$ are two such sequences $\{c_n\}$, then the two corresponding infinite convolutions (38) represent congruent distribution functions.

If, on the other hand, a convolution sequence (37) is flat, then the infinite convolution (38) is divergent for every $\{c_n\}$.
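A simple family on which the flat and non-flat cases can be told apart is the symmetric two-point law $\sigma_n$ carrying mass $\tfrac{1}{2}$ at each of $\pm a_n$, for which $E(\sigma_n) = 0$ and $D(\sigma_n) = a_n^2$. The sketch below assumes that the criterion of Theorem 4 is the convergence of the series of the $D(\sigma_n)$; since all spectra lie in the interval $-1 \le x \le 1$ when $0 < a_n \le 1$, that criterion would then be necessary as well as sufficient. The function name `variance_sum` and the two choices of $a_n$ are illustrative, and finite partial sums can only suggest convergence or divergence, never prove it.

```python
import math


def variance_sum(a, N):
    """Partial sum of D(sigma_n) = a(n)^2 for n = 1..N (two-point laws at +-a(n))."""
    return sum(a(n) ** 2 for n in range(1, N + 1))


N = 100_000

# a_n = 1/n: the variances 1/n^2 have a convergent sum (bounded by pi^2/6).
fast = variance_sum(lambda n: 1.0 / n, N)

# a_n = 1/sqrt(n): the variances 1/n form the divergent harmonic series.
slow = variance_sum(lambda n: 1.0 / math.sqrt(n), N)

print(f"a_n = 1/n:       sum of D(sigma_n), n <= {N}: {fast:.6f}")
print(f"a_n = 1/sqrt(n): sum of D(sigma_n), n <= {N}: {slow:.6f}")
```

Under the stated reading of the criterion, the first choice gives a non-flat convolution sequence and the second a flat one, for which no choice of the translations $c_n$ can restore convergence.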


In what follows, use will be made of the following fact which is merely 
a restatement ** of a part of the fundamental result of Khintchine and Kol- 
mogoroff ? concerning “ equivalent ” series of independent random variables: 


(XXII) The infinite convolution $\sigma_1 * \sigma_2 * \cdots$ is convergent if and only if so is the infinite convolution $[\sigma_1]_t * [\sigma_2]_t * \cdots$ for a $t > 0$ (in which case the same holds for every $t > 0$). It is understood that $[\sigma]_t$ denotes the distribution function defined in (IX).

On combining (XXII) with the known convergence criterion ¹⁰ for a convolution of distribution functions with uniformly bounded spectra, one obtains

(XXIII) The infinite convolution $\sigma_1 * \sigma_2 * \cdots$ is convergent if and only if so are both series

$\sum_{n=1}^{\infty} E([\sigma_n]_t)$, $\sum_{n=1}^{\infty} D([\sigma_n]_t)$ [cf. $(33_1)$, (33)]

for a $t > 0$ (in which case the same holds for every $t > 0$).


¹¹ This theorem has been stated without a detailed proof by P. Lévy, loc. cit. °; p. 340; cf. also p. 337.
¹² Cf. B. Jessen and A. Wintner, loc. cit. *, Theorems 32 and 34.

Absolutely convergent infinite convolutions. Two infinite convolutions, $\sigma_1' * \sigma_2' * \cdots$ and $\sigma_1'' * \sigma_2'' * \cdots$, will be said to be rearrangements of each other if the sequences $\{\sigma_n'\}$ and $\{\sigma_n''\}$ are permutations of each other. An infinite convolution $\sigma_1 * \sigma_2 * \cdots$ is said to be absolutely convergent if every rearrangement of it is a convergent infinite convolution. It is known ⁸ that in this case all rearrangements represent the same distribution function. The definition of an absolutely convergent infinite convolution clearly implies

(XXIV) Both (XXII) and (XXIII) remain valid if one reads "absolutely convergent" instead of "convergent."

THEOREM 7. There exists for every convergent infinite convolution $\sigma_1 * \sigma_2 * \cdots$ a sequence $\{c_n\}$ of numbers such that the infinite convolution $\sigma_1(x - c_1) * \sigma_2(x - c_2) * \cdots$ is absolutely convergent. Needless to say, another sequence, $\{\bar{c}_n\}$, of numbers has this property if and only if $|c_1 - \bar{c}_1| + |c_2 - \bar{c}_2| + \cdots$ is a convergent series.


Proof. Choose a fixed ¢ > 0, put 
(39) = [on(x) Jor, 
where [ ].+ is defined by (IX), and let 
(40,) =pn(x@—en); (402) = xn(2%) — Cn), 
where, on using the notation (33,), the number ¢, is chosen as follows: 
(41) or cn = (pn) according as | E(pn)| >t or | E(pn)| St. 
Thus | cn | St, and so (39), (40,), (402) and (IX) imply that 

[xn]¢ = 


Hence (XXIV) and (40.) show that the infinite convolution (38) is abso- 
lutely convergent if and only if so is the infinite convolution 7, (x) *r2(z) 
i.e., if and only if so are both series 


(42,) SE (tm); (422) D(tm). 


Now (42,) is absolutely convergent, since, on the one hand, 


E (tn) = E(pn) —¢n, by (35,) and (40,), 


and, on the other hand, 


E(pn) =n for every sufficiently large n, 


as seen from (41) and from the fact that $\sum E(\rho_n)$ is, by (XXIII), (39) and the assumption of Theorem 7, a convergent series. On using (35) instead of $(35_1)$ and (34) instead of (41), it is similarly shown that $(42_2)$ is absolutely convergent. Consequently, the infinite convolution (38) is absolutely convergent. This completes the proof of Theorem 7.


Theorem 7 implies, in view of Theorem 6, 


THEOREM 8. A convolution sequence (37) is non-flat if and only if there exists a sequence $\{c_n\}$ of numbers such that the infinite convolution (38) is absolutely convergent.

Theorem 8 and Theorem 3 imply

THEOREM 9. If two sequences $\{\sigma_n'\}$, $\{\sigma_n''\}$ of distribution functions are permutations of each other, then the two function sets $\|\phi_n'\|_0$, $\|\phi_n''\|_0$, where $\phi_n' = \sigma_1' * \cdots * \sigma_n'$, $\phi_n'' = \sigma_1'' * \cdots * \sigma_n''$, are identical function sets. In other words, either both convolution sequences $\{\phi_n'\}$, $\{\phi_n''\}$ are flat or both are non-flat, and in the latter case the two distribution functions $\rho'$, $\rho''$, which $\|\phi_n'\|_0$, $\|\phi_n''\|_0$ determine up to congruences, are congruent.


Theorems 3 and 8 can be interpreted as describing the possible behavior of any divergent infinite convolution. Correspondingly, Theorem 9 describes what can happen to a non-absolutely convergent infinite convolution upon an arbitrary rearrangement of its "factors." In the non-flat case, Theorems 8 and 9 imply the more precise fact that the infinite convolution behaves upon a rearrangement exactly the same way as a certain numerical series, which is determined by the sequence of the "factors" up to an additive absolutely convergent numerical series.


THE JOHNS HOPKINS UNIVERSITY. 



AN ANALOGUE OF JACOBI’S CONDITION FOR THE PROBLEM 
OF MAYER WITH VARIABLE END POINTS.* 


By THoMAS FREEMAN COPE. 


Introduction. The problem of Mayer with variable end points, as stated by Bliss,¹ is the determination of the properties of an arc

(E) $y_i = y_i(x)$, $(i = 1, \dots, n)$,

which minimizes the first of a set of functions

$f_\rho[x_1, x_2, y(x_1), y(x_2)]$, $(\rho = 1, \dots, r)$,

in the class of similar arcs which make $f_2, \dots, f_r$ vanish and besides satisfy the differential equations

(1) $\phi_\alpha(x, y, y') = 0$, $(\alpha = 1, \dots, m < n)$,

where $y(x)$ is a symbol for $y_1, \dots, y_n$. Bliss has shown ² that if $\Omega$ denotes the function

$\Omega = \lambda_\alpha(x)\phi_\alpha$,

then a necessary condition for a minimum is that there shall exist $m$ functions $\lambda_1(x), \dots, \lambda_m(x)$, not all identically zero on $x_1x_2$, satisfying the differential equations

(2) $\frac{d}{dx}\Omega_{y_i'} - \Omega_{y_i} = 0$,


and making all determinants of order r + 1 of the matrix 


fous fovie 


(3 
— Oy, (2; irs (21), — (22) + (£2) — Oy, (22) 


* Presented to the American Mathematical Society (Chicago), April, 1928. Received by the Editors March 1, 1937. This paper is a somewhat revised form of the author's dissertation, see 4. The great interest shown in the problem and methods of this paper, as attested by the bibliography at the end, seemed to the author to justify its publication at this time.

The numbers in the footnotes refer to the bibliography at the end where further references will be found.


656 THOMAS FREEMAN COPE. 


vanish. According to the usual convention of tensor analysis, it is understood here and elsewhere in this paper that a subscript $i$, $j$, $\alpha$, $\beta$, etc., repeated in the same term indicates a sum. The subscripts $x$, $y$, $y'$ denote partial derivatives. It is also understood that the arguments in $f_\rho$, $\Omega$, and their derivatives are those belonging to E.

The preceding statement of the necessary condition is equivalent to the statement ³ that there must exist $m$ functions $\lambda_\alpha(x)$, not all identically zero on $x_1x_2$, satisfying the equations (2), and $r$ constants $l_1, \dots, l_r$, not all zero, satisfying with them the $2n + 2$ equations


(4) fay + is —OQy, (2, — fuse —Q,, (a2) = 0, 
where 


In the first section of this paper, the hypotheses on which the analysis is 
based, are stated and preliminary notions and theorems considered. The first 
and second variations are computed in section 2. It is shown in section 3 that 
a minimum problem for the second variation may be formulated and moreover 
that it can be transformed into a problem of the same type as the original one. 
The differential equations and boundary conditions corresponding to (2) and 
(4) above for this auxiliary minimum problem are then given. In section 4 
a boundary value problem associated with the second variation is stated and 
discussed, and by means of it a necessary condition for the original minimum 
problem is proved. This condition is essentially that for a minimizing arc for 
the original problem the boundary value problem of the second variation can 
have no solution for negative values of its parameter. It is then shown in 
section 5 that the boundary value problem of the second variation can be 
transformed into one that has been treated by Bliss. Section 6 is devoted to 
the task of proving that the transformed boundary value problem is “ definitely 
self-adjoint,” according to Bliss’s definition. From this fact much information 
is automatically obtained about the characteristic constants and solutions of 
the original boundary value problem. 

The form of the second variation appearing in section 2, the analogue 
of the Jacobi necessary condition of section 4, and the discussion of the 
boundary value problem of the second variation of sections 5 and 6 were first 
given, for the general problem of Mayer with variable end points, in my dis- 
sertation (see 4). Proofs of similar analogues have since been published by 
Morse (see 7, p. 524) and Reid (see 14, p. 840). The same authors have also 


³ 1, p. 311.




treated the boundary value problem of the second variation by methods quite 
different from mine and from one another (see 7, pp. 542-546, and 10). 
Analogues of the Jacobi necessary condition different in form from those men- 
tioned above have been given by Bliss (see 8, p. 266) and Hestenes (see 12, 
p. 483). 


1. Preliminary notions and theorems. The arc E is supposed to have the following properties: ⁴

1. It is of class $C'''$ and such that the functions $\phi_\alpha$, $f_\rho$ are of class $C^{IV}$ in a neighborhood R of the values $(x, y, y')$ on E.

2. It satisfies the equations (1) and

(5) $f_\rho = 0$, $(\rho = 2, \dots, r)$.

3. The matrix $\|\phi_{\alpha y_i'}\|$ has rank $m$ at every point of E.
4, The matrix 


| fea: fae I|, (p = 1,- 1), 


with 2n + 2 columns and r rows is of rank r at the values of the arguments 
of the functions fp on FE. 
Consider a one-parameter family of arcs ° 


(6) yi = Yi(z,€), 
Xi(e) SetSX(e), 2X,(0) (c= 1,2), 


which contains H for «0 and satisfies the equations (1) and (5) for every 
ein a neighborhood of e— 0. Its variations are by definition the expressions 


= X,.(0), Y ie(2, 0). 


The variations then satisfy the equations of variation 


(7) (2, 1) 1) = + = 9, (a *,m), 
(8) Fo(é,4) = 9, (o = 
where 
(9) Fo 0) = + + fovanis 
-+- (fpz. + + (p r). 

The functions y, y’ occurring explicitly and in the derivatives are those 
defining 

Consider now a system H of r sets of variations &/, ni? (p =1,° 


*1, p. 307. pi ser. 




t= 1,2), with 7’s of class C’”’ and satisfying the equations (7). Variations 
of this sort with €, and & arbitrary constants are called admissible variations, 
A minimizing arc F# is said to be normal for the problem under consideration 
when a system H of variations can be so selected that the matrix 


|| Fo n°) |, 


has rank r—1. Let an admissible arc be defined as an arc of class C’” on 2,2,, 
whose elements (2, y,y’) all lie in R, and which satisfies the equations (1), 
The following theorem and its corollary with their proofs, which are omitted 
here, are similar to those given by Bliss in his lectures at Chicago in the 
summer of 1925.° 


THEOREM. For every normal minimizing arc E of class C’”’ on 2,22 for 
the Mayer problem with variable end points, ther? exists a one-parameter 
family of admissible arcs (6) containing E for «=O and satisfying the 
equations (5). The functions Y; are of class C’” in w, and Yi, Y’i, X, 
of class C” ine, neara, 


CoroLiary. If a set of admissible variations é, n(x) for a normal mini- 
mizing arc E of class C”’ on a4. for the Mayer problem with variable end 
points satisfies the equations (8), there exists a one-parameter family of ad- 
missible arcs (6) satisfying the end conditions (5), containing the arc E for 
«= 0, and having the set é, » as its variations along E. The functions Y; are 
of class C’” in and Yi, Y’;, X, of class CO” in near a, S a, = 0. 


2. The first and second variations. Consider the minimizing arc E of
the corollary. There must exist m functions λ_β(x) of class C', not all identically
zero on x₁x₂, satisfying the equations (2), and r constants l_ρ, not all zero,
satisfying the equations (4). Moreover, since E is normal, l₁ must be different
from zero.⁷ l₁ may then be chosen equal to unity, and in the following
pages it will be assumed that such a choice of l₁ has been made.

Substitute the one-parameter family of the corollary in the functions
f_ρ, φ_α and differentiate with respect to ε. The result is


oh _ ¥.), 


= Fo(X,, 


⁶ 5, pp. 694-695, and 4, pp. 6-9.
⁷ 1, p. 311.




AN ANALOGUE OF JACOBI’S CONDITION 659 


where Fp and ®, with the arguments indicated are defined by equations (7) 
and (9), but with coefficients taken for e—e. Differentiating again with 


respect to « and putting « = 0, we find 


(<2) = Fy (Xee Yee) + 2fiy,YjeXe + |? + Qi(Xe, Ve), 
(10) 0 Fo(Xee, Y ce) Y"jeXe |? fay, |? + o(Xe, Ye), 
0) ©, (2, Y ec, +- (2, Ye, 


where (1, Qo, 2q are quadratic forms in the arguments indicated, and where 
all the arguments are taken fore 0. Multiply the first 7 equations of (10) 
by 1,, lo, respectively, and add. Then because of the equations (4) of section 1, 


the sum can be written 


| €=0 X4(€) X 
(11) ) = Oy, Y jec + 20y jeXe lpQp(Xe, Ye); 


de €) 2( 


where all the elements are taken for e 0. Multiply now the last m equations 
of (10) by Ag and add. The result is, if we put » = Agog, 


0 = Qy, View + Oy jee + 2u(2, Ye, 
d 


d 
=} jee(Qy, dz > dz (Qy,Y jee) 


d 
dz (Qy,Y jee) Rw, 


since for ε = 0 the Euler-Lagrange equations (2) are true. Integrating
between X₁(ε) and X₂(ε) for ε = 0, we obtain


Xo(€) Xo(€) 
+ 2w (2, Y., Y’.) dx. 


| 


(12) 0 = Oy, V jee 


Finally by multiplying the equations 


(2, Y., Y’.) 
by A. and adding, we find 
whence 
| X2(€) X2(€) 


| X1(€) 


where as before we take ε = 0. Now add (11), (12), and (13). The result
will be the desired form of the second variation, after putting




namely, 


where the quadratic forms 2w and Q are explicitly 
Q = Aé,? + + + 2D (41) + 2E 
+ (21) Eo + 2G inj Eo + (1) 95 (21) 
21459 (22) + 95 (2) 5 
= = Oy y's 5 


A= (fos + (fer + s(t2)); 


(15) d d 
C= Fy, (fos + Dy Fon — 


d d d 
E; fue Fi = G5 = Fy, fun + (22), 


If y_i = y_i(x) is a minimizing arc it is evidently necessary that the first
variation I₁,
d 
(2) 
€ 


vanish for all sets ξ, η satisfying the differential equations and end conditions
(7) and (8). It is also necessary that the second variation I₂ be greater than
or equal to zero for the same sets ξ, η.


3. The minimum problem of the second variation. A problem suggesting
itself at this point is that of minimizing the second variation (14) in the
class of all sets of variations ξ, η of class C’” on x₁x₂, satisfying the equations


(16) dz (2,7, = 0, (= 1,2; @==1,---,m), 
Fo(é,n) = 9, 


It is evident that the second variation I₂ must be greater than or equal to zero
in the class of all such sets ξ, η. It is found convenient to put a further
restriction on the sets ξ, η. Let us introduce the equation


\[ (17) \qquad \xi_1^2 + \xi_2^2 + \int_{x_1}^{x_2} \eta_i(x)\,\eta_i(x)\,dx = 1. \]


We then consider a second problem of the second variation and its relation to 




the one first proposed. This second problem is to minimize the second variation
in the class of all sets ξ, η satisfying the equations (16) and (17). We
shall call ξ, η an admissible set for the first problem if the ξ's and η's are of
class C’” on x₁x₂, and if the set satisfies the equations (16); and an admissible
set for the second problem if the set further satisfies the equation (17). Every
admissible set for the first problem, except the trivial one ξ = η = 0, will
evidently give rise to a set kξ, kη, which is an admissible set for the second
problem, where k is a constant different from zero. Moreover, every admissible
set ξ, η of the second problem is necessarily an admissible set of the first
problem. It follows that the admissible sets of the second problem form a subclass
of the class of all admissible sets of the first problem. We can then state


the following 


THEOREM. If the second variation I₂(ξ, η) is greater than or equal to zero
for all sets ξ, η which satisfy the equations (16), where the ξ's and η's are of
class C’” on x₁x₂, then I₂(ξ, η) is necessarily greater than or equal to zero for
all such sets ξ, η which satisfy both the equations (16) and (17), and conversely.
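
The scaling step behind this theorem can be written out explicitly. The following is only a sketch: it assumes that the equations (16) are linear and homogeneous in ξ, η, that I₂ is a homogeneous quadratic functional, and that (17) has the form shown; the symbol N for the left member of (17) is introduced here purely for convenience.

```latex
% N is a name assumed for this sketch: the left member of the normalization (17).
\[
N(\xi,\eta) = \xi_1^2 + \xi_2^2 + \int_{x_1}^{x_2} \eta_i\,\eta_i\,dx .
\]
% For an admissible set (\xi,\eta) of the first problem other than the trivial one,
% N(\xi,\eta) > 0, so one may put
\[
k = N(\xi,\eta)^{-1/2} .
\]
% The set (k\xi, k\eta) still satisfies (16), and by homogeneity
\[
N(k\xi,k\eta) = k^2\,N(\xi,\eta) = 1, \qquad
I_2(k\xi,k\eta) = k^2\,I_2(\xi,\eta) .
\]
```

Since k² > 0, the values I₂(ξ, η) and I₂(kξ, kη) have the same sign, so non-negativity on the normalized class carries over to every non-trivial admissible set of the first problem; for the trivial set I₂ = 0.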


The problem that will now be considered is that of minimizing the second
variation I₂(ξ, η) in the class of all sets ξ, η which satisfy the equations (16)
and (17) with ξ's and η's of class C” on x₁x₂. This problem is not quite in
the form of the original problem in xy-space, but may be changed to that form
by the introduction of new variables. For, let η₀(x), η_{n+1}(x) be defined by


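\[ \eta_0(x) = \int_{x_1}^{x} 2\omega(x, \eta, \eta')\,dx, \qquad \eta_{n+1}(x) = \int_{x_1}^{x} \eta_i(x)\,\eta_i(x)\,dx, \]

\[ \eta_0(x_1) = 0, \quad \eta_0(x_2) = \int_{x_1}^{x_2} 2\omega(x, \eta, \eta')\,dx, \quad \eta_{n+1}(x_1) = 0, \quad \eta_{n+1}(x_2) = \int_{x_1}^{x_2} \eta_i\,\eta_i\,dx. \]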


Let also the values of the constants ξ₁, ξ₂ at the end points x₁ and x₂ be denoted
by ξ₁₁, ξ₂₁ and ξ₁₂, ξ₂₂ respectively, and consider (as we obviously may with no
loss of generality) only ξ₁₁ and ξ₂₂ as occurring in the equations (15) and (16)
and also in the second variation. We can then formulate an auxiliary Mayer
problem as follows:
To find the properties of a set of functions 


no(2), ni(@), (2), é,(z), 
which minimizes the expression 
n=1,= no(t2) + Q (Ess, E225 ni ), 


in the class of such sets of functions satisfying the differential equations 




10 — 2w(2, 1) = 0, (2, > 1) = payjni + j an®, 
(1,7 a—1,---,m), 


and the end conditions, 


92 = = 0, 
Jou = Fo €22, ni(21), ni(%2)) = 0, (o amB,* >,#), 
=u (41) = 9, 
Gris + + —1 = 0, 


with ξ's and η's of class C’” on x₁x₂.
This auxiliary problem is a Mayer problem of the type considered in the
introduction and so a minimizing set of functions


(19) 


no(x), ni (x), (2), E,(x), 


of class C’” on x,x, must satisfy the conditions there stated.* Let jy» be the 
multiplier associated with — 2; pa(x), (a =1,- - +, m), those associated 
with fms2, those associated with 2, respectively; and, finally, p, 
that associated with 7’n.1 — ynini, where the p’s are of class C’ on 2,25. Define 
Ir by the equation 


= po (7/0 — 20) + + + + — 


Then the Euler differential equations for the auxiliary problem are 


d d 


It follows from the (n + 2)-nd of these equations that μ is a constant.

It should now be observed that a minimizing arc for this problem is surely 
a “normal” arc. For, from the form of the equations (18), it follows that 
the values of 0, €, at and the value of é at x =, are entirely 
arbitrary. By hypothesis there exist r sets of admissible variations (é, 7’), 
=1,---,7), so that the matrix || Po(&,7’)||, (0 has rank 
r—1. Hence from this and the arbitrariness of the end values of 70, qni1, &1» 
as just described, there must exist 7+ 3 sets of admissible variations 
Nos Nis Nns15 £15 €2, Such that the matrix 


Gy(&, 7°) ||, (v = -,r+3; §=1,: -,r+3), 


has rank r+ 2, where Gy(é,7) =0 are the equations of variation corre- 
sponding to the equation (8). 




Let Ls be the r + 3 constants associated with gs with L,=1. Let 6 take 
on successively the values yo, ni, 9n+1, 1,2. Then from equations (4), the 
boundary conditions for the auxiliary minimum problem are seen to be 


Lsgso, — 21 = 0, + 21 = 0. 


From the three equations in which 62 = 92 = 91 = yns1,1, it follows 


that Lr.3 = — p, wo = —1, Lrg =p, respectively. It then follows from the 
equation in which 0, = yo: that L,———1. When 0, = &,. and 0, = 2, it is seen 
that = mig = 0. Now re-name Lrs1, respectively, Lr. 


The remaining boundary conditions have the form 


Ve, Lo (fox, + 0, 


(20) + Lo (fox, + iz) 0, 
Onis + Lofoyis + (22) 0, 


The Euler differential equations of the auxiliary problem are now seen 
to be 


(21) Ty, = Ta 
where 
(22) Ty, + Ty, —= On, + PaPay; — 


It is to be noted that the undetermined constants and multipliers are now
L_σ, μ, μ_α(x) where (σ = 2, ..., r; α = 1, ..., m).
The results of this section may be summarized as follows:


Let ξ₁, ξ₂, η_i(x) be a set of functions of class C’” on x₁ ≤ x ≤ x₂ which
minimizes the second variation I₂(ξ, η) in the class of such sets satisfying the
differential equations and the end conditions (16) and (17). Then there must
exist m multipliers μ_α(x), (α = 1, ..., m), of class C' and not all identically
zero on x₁x₂, and r constants μ, L_σ, (σ = 2, ..., r), not all zero, satisfying
the differential equations (21) and the boundary conditions (20), in which
ξ₁₁ and ξ₂₂ may be replaced by ξ₁ and ξ₂ respectively.


4. The boundary value problem of the second variation and a necessary
condition for the original problem. From the results of the last section
it may be seen that there is a boundary value problem associated with the
second variation which may be stated as follows:

To determine multipliers μ_α(x) of class C' and constants L_σ, μ, together
with variations ξ₁, ξ₂, η_i(x) of class C’” on x₁x₂, which satisfy the differential


equations 




(t=1,---,n; 


and the boundary conditions (20) and (8). 

It will now be shown that there are restrictions upon the possible values
of μ for which the boundary value problem has solutions, in view of the minimizing
properties of the arc E for the original minimum problem considered.
Suppose that the set ξ_s, η_i(x), L_σ, μ is a non-trivial solution of the boundary
value problem and that it also satisfies the equation (17); this last condition
could always be met for any given set by multiplying by a suitably chosen
positive constant. Multiply the equations (20) by ξ₁₁, ξ₂₂, η_i(x₁), η_i(x₂),
respectively, and add. Then since Q is a quadratic form in ξ₁₁, ξ₂₂, η_i(x₁) and
η_i(x₂), and because of the equations (8), there will result


(24) 20 | + — 0. 


But from equations (22), (15) and (7), it follows that 
d d 


= 2w + paPa — 


whence, in view of the equations (21), 


Ze Ze 
-{ 2udx — ninide. 


After substituting the right-hand side of this equation in the second term of 
(24) and recalling that 


ily, 


it is seen that


\[ I_2 - \mu \left( \xi_1^2 + \xi_2^2 + \int_{x_1}^{x_2} \eta_i\,\eta_i\,dx \right) = 0, \]
and hence on account of (17) that
I₂ = μ.
The following necessary condition on E has thus been proved:
If E is a minimizing arc for the original minimum problem in xy-space,
then the boundary value problem of the second variation described in this
section, the equations of which are (23), (20), and (8), can have no solutions
for negative values of the parameter μ.
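
The relation I₂ = μ can be illustrated on a much simpler problem. The Python sketch below is a toy analogue and not the boundary value problem of this section: it assumes a single function η on [0, π] with η(0) = η(π) = 0, the quadratic form ∫η'² dx in place of I₂, and the normalization ∫η² dx = 1 in place of (17), so that the differential equation is η'' + μη = 0. All function names are hypothetical.

```python
import math

def quadratic_form(eta, h):
    """Discrete analogue of the integral of eta'(x)**2, with eta = 0 at both end points."""
    padded = [0.0] + list(eta) + [0.0]
    return sum((b - a) ** 2 for a, b in zip(padded, padded[1:])) / h

def normalization(eta, h):
    """Discrete analogue of the integral of eta(x)**2 (the toy counterpart of (17))."""
    return h * sum(v * v for v in eta)

def toy_check(k, n=200):
    """Return (I2, mu) for the k-th characteristic solution on [0, pi], n interior nodes."""
    h = math.pi / (n + 1)
    raw = [math.sin(k * math.pi * j / (n + 1)) for j in range(1, n + 1)]
    scale = math.sqrt(normalization(raw, h))
    eta = [v / scale for v in raw]  # rescaled so that normalization(eta, h) equals 1
    # Characteristic value of the second-difference operator for this sine mode.
    mu = (2.0 - 2.0 * math.cos(k * math.pi / (n + 1))) / h ** 2
    return quadratic_form(eta, h), mu

if __name__ == "__main__":
    for k in (1, 2, 3):
        i2, mu = toy_check(k)
        print(k, i2, mu)
```

In this toy setting the value of the quadratic form on a normalized characteristic solution agrees with the characteristic value, and every characteristic value is positive, which is the discrete counterpart of the statement that a minimum excludes negative μ.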




5. Transformation of the boundary value problem of the last section.
The boundary value problem of the last section will now be transformed into a
problem that has been discussed by Bliss.⁹ It will first be assumed that a
solution


E,, ni(%), pa(x), Lo, p, 1,2; -,2) 


of the boundary value problem of the last section for a minimizing arc E for
the original minimum problem has been found. Note that


Poni Pons foie, (o = 2,5 °°, r). 


For convenience, let ξ₁, ξ₂ be re-named η_{n+1}, η_{n+2} respectively, and adjoin to
the boundary value problem of the last section the equations


\[ \eta_{n+1}(x_1) - \eta_{n+1}(x_2) = 0, \qquad \eta_{n+2}(x_1) - \eta_{n+2}(x_2) = 0. \]


It is then clear in view of these equations and hypothesis 4 of section 1 that 
the matrix 


has rank r-+ 1. As a matter of notation, let T,, be renamed ¢;, and for 
convenience in subsequent proofs, let us introduce two new functions fni1, fns2 
defined by the equations 


The boundary conditions of an equivalent boundary value problem are then 


bln, — =0, 
QF + 2En+2,1 = 0), 
(26) Lok + Ons + =0, 
+1, + + = 9, 
LoP + — + 2En+2,2 = 9, 

F, = 0, 


It is to be observed that Fo and Q do not contain. 9n41,2 and yns2,1- 
Suppose now that a and bi, ({—1,:--,n-+ 2), are 2n + 4 constants 
satisfying the equations 


= 




(27) AF + = 0, (r= 2,---,r+2). 
In view of the rank of the matrix (25), there will be 2n + 3 − r
linearly independent sets of such constants, say
a_i^λ, b_i^λ, (λ = 1, 2, ..., 2n + 3 − r).


Multiply the first n + 2 equations of (26) by a_i^λ, the second set of n + 2
equations by b_i^λ, and add. The result will be because of (27)


s=—1,:--,n+2). 


The quadratic form Q with arguments η_{s1}, η_{s2} may be written


\[ Q(\eta(x_1), \eta(x_2)) = A_{st}\,\eta_{s1}\eta_{t1} + 2B_{st}\,\eta_{s1}\eta_{t2} + C_{st}\,\eta_{s2}\eta_{t2}, \qquad (s, t = 1, \dots, n+2), \]


where A_{st} and C_{st} may without loss of generality be taken symmetric. It is
to be emphasized that Q does not contain η_{n+1,2}, η_{n+2,1}. Now define ||δ'_{st}|| to
be the matrix such that δ'_{st} = 1 if s = t = n + 1 and otherwise zero, and
||δ''_{st}|| to be the matrix such that δ''_{st} = 1 if s = t = n + 2 and otherwise
zero. The boundary conditions (26) then imply the 2n + 4 conditions


— + + — + | nes 
+ [Drs (Cet — pdt) + AneBet | nto = 0, 
F;(n) = + 0, 


In matrix notation these equations may be written 


Ans(Ast post) + bysBes, — 
0 bv1) 
(28) TNtW 
Drs (Cst po’ st) + AysBst, 


v2 == (), 
0 (nt2, bee) 
A—1,---,2n+3—r; r—2,---,r+2). 


The differential equations 

a 

dz 

(29) $, (2, 1) => pays Ni “++ day’ 0, ns2 — 0, 




will now be reduced to an equivalent set which is for our purposes more useful.¹⁰
Recalling (22) and (15), let Haj, Ka; be defined with ¢, by the equations 


and assume as usual for regular problems in the calculus of variations that 
the determinant 


Rij 


2 (a, 8 Lap = 0), 


is different from zero on x₁ ≤ x ≤ x₂. Then the m + n equations


= + + Keine, 
0 = Hajnj + Kai's, 


can be solved for 7’;, ug. These solutions are 
1 i i 1 pa 
= — (Be! Ka‘ Hains) Bibs, 


1 1 
Ba = — (Kal + Le*Hpini) + 
(k =1,: 
in which, with reference to D, FR,‘ is the cofactor of Rxi; Ka‘, Lg, the cofactors, 


respectively, of Kai, Iga. Note that the matrix || A,‘ || is symmetric. With 
the help of (29) it is seen that 


= + + Haipa — 


When the values of 7; and po given above are substituted in these equations, 
they become 


1 1 
= Pijnj — D Qin( + + D Qinkd 


1 
— + + 5 


Hy, K vie; — poi, 


If D is different from zero, the differential equations (29) are then equivalent 
to the following set in matrix notation: 


¹⁰ 15, p. 590.




+ Ka4Hat), 
D"Qae Qiu + — D* Hyg ( Kv"Qiu + 
(30) (QasRi* + 
+4 
a,8,v—1,---,m), 


(1, 


where δ_{ql} is the Kronecker symbol if q and l are ≤ n and otherwise zero.
If any one of l, q, s, u in a coefficient in (30) is greater than n, that coefficient
is zero.

If D is different from zero, the boundary value problem of the last section 
can then be transformed into the following boundary value problem: 

To determine functions fg 
with η's of class C’””’, ζ's of class C” on x₁x₂, and a constant μ satisfying the
differential equations (30) and the boundary conditions (28). It is clear that,
under the hypotheses, every solution of the boundary value problem of section 4
whose equations are (23), (20), and (8), for a minimizing arc E of the
original problem, is a solution of the transformed problem just stated.


6. The self-adjoint character of the auxiliary boundary value problem.
The auxiliary boundary value problem of the last section whose equations are
(30) and (28) will now be shown to be “self-adjoint” according to the
definition of Bliss.¹¹ In the paper referred to, it is proved that a necessary
and sufficient condition for the self-adjointness of a boundary value problem

(Asa(2) + pBia(2))¥e(2), 1S See, 


(31) dz 
M iaYa(2:) + NiaYa(X2) = 0, 
is that there exist a transformation 7;;,(z) such that 


-t- Agil ax ix 0, T Bail ax 0, 
(1, j,k, a, B =1,- 


The functions A_{ik}(x), B_{ik}(x), T_{ik}(x), T'_{ik}(x) are assumed to be continuous
functions of x with |T_{ik}| ≠ 0 on x₁x₂ and M_{ik}, N_{ik} are constants with n the
rank of the matrix ||M_{ik}, N_{ik}||.

The boundary value problem given by equations (28) and (30) is evidently
of the form


¹¹ 3, p. 569.




Bay 
dz 


Di; 


+ 


ol) (Yi, 


(yi 2; (#2) ) = 9, 


(33) Bn By 


Fp; 


Gis 0 | 


where, in view of (28) and (27), we suppose 


(34) — FraGia = 0, 


This is clearly of the same type as (31).

It will now be shown that with the transformation T, which with its
inverse has the form


—&, 


8; j 0 


the equations (32) are satisfied for the boundary value problem of the last
section. To do this, we consider what properties the matrices ||A_{ij}||, ..., ||G_{ij}||
of (33) must have if the equations (32) are to be satisfied. For the first set
of equations of (32) to be satisfied, it is necessary and sufficient that


| Sia Aaj Baj Agi Cai | 0 8aj | 
—8ia 0 Day Bai Dail) ||—8aj |’ 
or 

| ‘OFF; Di; | Aji 

| 0. 

This is possible if and only if 

Cis = Ba; = By, Ai = — 


or, in other words, the matrices ||C_{ij}|| and ||B_{ij}|| must be symmetric, and the
matrix ||A_{ij}|| must be equal to the negative of the transpose of the matrix
||D_{ij}||. It readily follows that the second set of equations (32) will be satisfied
if and only if


Cij = Chay 


that is, the matrix || Oi; || must be symmetric. These conditions are easily 
verified to be fulfilled for the corresponding matrices of the equations (30). 
Hence for the transformation T, as defined above, the first two sets of equations
(32) are satisfied for the boundary value problem of the last section. 
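
The first of these conditions can be checked by a direct block computation. The sketch below rests on assumptions that should be compared with the displayed formulas: that T is the constant matrix with blocks 0, I, −I, 0 (I the unit matrix), that the coefficient matrix of (33) is written in blocks A, B, C, D, and that for constant T the first set of equations (32) reduces to T𝒜 + 𝒜ᵀT = 0. The names 𝒜 and I are introduced only for this illustration.

```latex
\[
T = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}, \qquad
\mathcal{A} = \begin{pmatrix} A & B \\ C & D \end{pmatrix},
\]
\[
T\mathcal{A} + \mathcal{A}^{\mathsf T} T
= \begin{pmatrix} C & D \\ -A & -B \end{pmatrix}
+ \begin{pmatrix} -C^{\mathsf T} & A^{\mathsf T} \\ -D^{\mathsf T} & B^{\mathsf T} \end{pmatrix}
= \begin{pmatrix} C - C^{\mathsf T} & D + A^{\mathsf T} \\ -(A + D^{\mathsf T}) & B^{\mathsf T} - B \end{pmatrix}.
\]
```

Under these assumptions the sum vanishes exactly when B and C are symmetric and A is the negative of the transpose of D, in agreement with the conditions stated above.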

The left member of the third equation of (32) for the problem (33) is 




Gia 0 Sag 0 Fo 0 
— Epp | Lop Gmp 
— Gig 0 |’ 


«B=—1,:--,n), 


and this last product is equal to 


| 


Similarly the right member of the third equation of (32) for the problem (33) 
is found to be 


— ap 0 


GipF gp, 0 


These two matrices will be equal if and only if 


= FogG mg, 


35) 
( — op = FopEap — ap. 


The first equation is true because of (34). The second equation may be verified 
for the boundary value problem of the last section. For, let the range of the 
subscripts of the last equations be 


B=1,---,n+2; l,m—2,---,r+2. 


Then substituting in (35) the values which make the matrices in (33) identical 
with those in (28), we find that the left and right members, respectively, of 
(35) are 


(— pg) (4qs( Asp — + basBps) — (dps (Asp — + DpsBgs) (— 8); 
(bpp) (Cop — pd" sp) + dqsBsp) — (bps (Cop — + dpsBeg) (bap). 


Because of the symmetry of the matrices || Age ||, || Cet ||, |] 8st || and |] 8st ||, 
these two expressions are equal if 


AppbqsBps = IpsBegbag, 
and 


= bpptqsBep. 


These, however, are true equations, as we may see by interchanging certain 
of the summation subscripts, and hence the third equation of (32) is satisfied 
for the boundary value problem of the last section. 




All three of the equations (32) are then satisfied for the auxiliary 
boundary value problem of the last section, from which it follows that this 
problem is self-adjoint. 

The auxiliary boundary value problem is, moreover, “definitely” self-adjoint,
according to the definition of Bliss. Let S_{ik} be defined by
S_{ik}(x) = T_{αi}B_{αk}. The boundary value problem (31) is then said to be
“definitely” self-adjoint¹² if the following conditions are fulfilled: (1) The
equations (32) are satisfied for the transformation T_{ik}(x). (2) The matrix
||S_{ik}(x)|| is symmetric. (3) The bilinear form S_{αβ}(x) f_α f̄_β formed for a set
of numbers f_i and their conjugate imaginaries f̄_i is positive or zero at every
point of x₁x₂. (4) This form vanishes identically for a set of solutions f_i(x)
of a system of equations


f’i(z) = Aia(r) + Bia(%) ga(x), (4, k, >”), 


where the g_i's are continuous functions of x on x₁x₂ but otherwise arbitrary,
only when the functions f_i(x) are all identically zero. These conditions are
easily seen to be satisfied for the auxiliary boundary value problem of the last
section since S has the value


\[ \begin{Vmatrix} \delta_{st} & 0 \\ 0 & 0 \end{Vmatrix}. \]


The theory of definitely self-adjoint boundary value problems as developed
by Bliss¹³ may now be applied in full to the problem of the last section.
According to that theory, the characteristic constants μ for which this problem
has solutions must all be real and are denumerably infinite in number, and the
linearly independent characteristic solutions corresponding to each characteristic
constant may be chosen real. Since every solution of the problem of
section 4 is also a solution of the problem of the last section, it follows that the
characteristic constants μ for that problem are also all real, and the linearly
independent characteristic solutions corresponding to each characteristic constant
may be chosen real. By the criterion of section 4, none of the characteristic
parameter values for the problem of section 4 can be negative if E is
a minimizing arc for the original problem in xy-space.


MARIETTA COLLEGE, 
Marietta, OHIO. 


¹² 3, p. 570.


¹³ 3, p. 571.




BIBLIOGRAPHY. 


1. Bliss, “The problem of Mayer with variable end points,” Transactions of the 
American Mathematical Society, vol. 19 (1918), pp. 305-314. 

2. Bliss, “A boundary value problem of the calculus of variations,” Bulletin of 
the American Mathematical Society, vol. 32 (1926), pp. 317-331. 

3. Bliss, “A boundary value problem for a system of ordinary linear differential
equations of the first order,” Transactions of the American Mathematical Society, vol. 28 
(1926), pp. 561-584. 

4. Cope, “An analogue of Jacobi’s condition for the problem of Mayer with 
variable end points,” Dissertation (1927), The University of Chicago. 

5. Bliss, “The problem of Lagrange in the calculus of variations,” American 
Journal of Mathematics, vol. 52 (1930), pp. 673-744. 

6. Morse and Myers, “The problems of Lagrange and Mayer with variable end 
points,” Proceedings of the American Academy of Arts and Sciences, vol. 66 (1931), 
pp. 235-253. 

7. Morse, “Sufficient conditions in the problem of Lagrange with variable end 
conditions,” American Journal of Mathematics, vol. 53 (1931), pp. 517-546. 

8. Bliss, “The problem of Bolza in the calculus of variations,” Annals of Mathe- 
matics, vol. 33 (1932), pp. 261-274. 

9. Myers, “ Adjoint systems in the problem of Mayer under general end-conditions,” 
Bulletin of the American Mathematical Society, vol. 38 (1932), pp. 303-312. 

10. Reid, “A boundary value problem associated with the calculus of variations,” 
American Journal of Mathematics, vol. 54 (1932), pp. 769-780. 

11. Bliss and Hestenes, “Sufficient conditions for a problem of Mayer in the calculus
of variations,” Transactions of the American Mathematical Society, vol. 35 (1933),
pp. 305-326.
12. Hestenes, “ Sufficient conditions for a problem of Mayer with variable end 
points,” Transactions of the American Mathematical Society, vol. 35 (1933), pp. 479-490. 
13. Morse, “ The calculus of variations in the large,” American Mathematical Society 
Colloquium Publications, vol. 18 (1934). 
14. Reid, “Analogues of the Jacobi condition for the problem of Mayer in the
calculus of variations,” Annals of Mathematics, vol. 35 (1934), pp. 836-848. 
15. Bolza, Vorlesungen über Variationsrechnung (1909).




ON THE ASYMPTOTIC DISTRIBUTION OF ζ'/ζ(s) IN THE
CRITICAL STRIP.*


By RICHARD KERSHNER and AUREL WINTNER.


According to Bohr,¹ Euler's product for ζ(s), which is divergent for
σ < 1, has for σ > ½ certain convergence tendencies. In particular, the formal
expansion of log ζ(s) is, for every fixed σ > ½, convergent in relative measure
to log ζ(s), and this holds uniformly for all σ > const. > ½. Bohr's proof
of this fact consists in first replacing ζ(s) by


\[ (1 - 2^{1-s})\,\zeta(s) = \sum_{n=1}^{\infty} (-1)^{n+1}/n^{s}, \qquad \text{where } \sigma > 0, \]
then applying Schnee's mean-value theorem uniformly for σ > const. > ½, and
finally removing the factor (1 − 2^{1−s}). Thus the method does not apply to
the function² ζ'/ζ(s), a function more directly connected with the prime
number distribution than ζ(s) itself.

In the present note there will first be obtained results for ζ'/ζ(s) which
are analogous to those of Bohr for log ζ(s). This will be made possible in
view of a general principle which may be formulated roughly as follows: One
cannot lose uniform convergence in relative measure by term-by-term differentiation
of a sequence of analytic functions (the same does not hold for
term-by-term integration).

The result is then applied to obtain for the asymptotic distribution of
ζ'/ζ(s), where σ > ½, results which are analogous to those previously obtained³
for log ζ(s) or ζ(s). Due to the convergence in relative measure, these results
follow immediately from the general theory of infinite convolutions,³ since the
logarithms of the prime numbers are linearly independent. The asymptotic
distribution of ζ'/ζ(s) has recently⁴ been discussed for σ > 1. The facts to be
proved seem to be new not only for ½ < σ < 1 but for σ = 1 as well.

The results are independent of Riemann’s hypothesis. 

Let s_n = σ_n + it_n denote those zeros, if any, of ζ(s) for which σ_n > ½,
and let J_n be the interval


t = tn, (<1), 


* Received January 29, 1937.

¹ Bohr [1], [2].

² By ζ'/ζ(s) is meant ζ'(s)/ζ(s).

³ Jessen and Wintner [3].

⁴ van Kampen and Wintner [4], Section 6.




an interval which is perpendicular to the critical line. Let Γ denote the set
obtained from the half-plane σ > ½ by removing the segments J_n and also
the segment
t= 0, 


Thus Γ is open, simply-connected, and contains the half-plane σ > 1. On
placing


(1) = pe); (n—=0,1,2,- -) 
it is clear that 


Let log ζ_n(s) denote that logarithm of ζ_n(s) which is regular analytic in Γ
and vanishes as σ → +∞. The considerations of the sequel will be based
on a classical result concerning the manner of divergence of Euler's product
in the critical strip, namely on
Bohr's Lemma.⁵ Corresponding to three arbitrary numbers η, ε₁, ε₂
satisfying
0<aK<l 0<e<l, 


there exists an N = N(η, ε₁, ε₂) with the property that one can choose for
every n ≥ N a u_n such that if T ≥ u_n, then the interval 2 ≤ τ ≤ T contains
a finite number of mutually disjoint subintervals I which have a total length
greater than (1 − ε₂)T and are such that if τ is in an I, then on the one hand
the half-strip


of the (σ + it)-plane is contained in Γ and on the other hand
|log ζ_n(s)| ≤ ε₁
for every s = σ + it in any of these half-strips.
Let Σ₀ denote the boundary of a square of side ½ in the complex s-plane
and let Σ_δ, where 0 < δ < ¼, denote the boundary of that square of side
½ − 2δ which is symmetrically placed in Σ₀. Then, if f(s) is any function
regular analytic on and within Σ₀, its derivative f'(s) satisfies the inequality


\[ (3) \qquad \max_{\Sigma_\delta} |f'(s)| \le \delta^{-2} \max_{\Sigma_0} |f(s)|, \qquad (0 < \delta < \tfrac14), \]


regardless of the position of Σ₀ in the s-plane. In fact, if s is in the interior
of Σ₀, then, according to Cauchy,


5 Bohr [1], Hilfssatz 5, p. 82. The numbers denoted above by E55 557, N,u,, are in 
Bohr’s notation e”, — 3, N*, T* respectively. 




\[ 2\pi i\, f'(s) = \int_{\Sigma_0} (s - w)^{-2} f(w)\,dw. \]


Now if s is on or within Σ_δ, then |s − w| ≥ δ for every w on Σ₀, so that
(3) is obvious. Let Σ₀ be any square of side ½ which has one side on the line
σ = ½ + η and is contained in one of the half-strips mentioned in Bohr's
Lemma. On applying (3) to each of these squares Σ₀ and to the function
f(s) = log ζ_n(s), it is seen that Bohr's Lemma implies the following:
Corresponding to four arbitrary numbers δ, η, ε₁, ε₂ satisfying


0<8<4, 0<aK<l, 
there exists an N = N(η, ε₁, ε₂) with the property that one can choose for
every n ≥ N a u_n such that if T ≥ u_n, then the interval 2 ≤ τ ≤ T contains
a finite number of mutually disjoint subintervals I which have a total length
greater than (1 − ε₂)T and are such that if τ is in an I, then on the one hand
the square


3-8 
is contained in Γ and on the other hand
|ζ_n'/ζ_n(s)| ≤ ε₁/δ²


for every s = σ + it in any of these squares.
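
The estimate (3) used in this step can be verified in two lines from Cauchy's formula. The derivation below is a sketch that assumes only that Σ₀ has side ½ (hence perimeter 2) and that s lies on or within Σ_δ, so that every point w of Σ₀ is at distance at least δ from s.

```latex
\[
f'(s) = \frac{1}{2\pi i} \int_{\Sigma_0} \frac{f(w)}{(w-s)^2}\,dw ,
\qquad |w - s| \ge \delta \quad \text{for } w \text{ on } \Sigma_0 ,
\]
\[
|f'(s)| \;\le\; \frac{1}{2\pi}\cdot\frac{\max_{\Sigma_0}|f|}{\delta^{2}}\cdot 2
\;=\; \frac{1}{\pi\,\delta^{2}}\,\max_{\Sigma_0}|f|
\;<\; \delta^{-2}\,\max_{\Sigma_0}|f| .
\]
```

With f(s) = log ζ_n(s) and max |f| ≤ ε₁ on Σ₀, this gives |ζ_n'/ζ_n(s)| ≤ ε₁/δ² on and within Σ_δ.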
Now if σ₀ is a fixed value such that ½ < σ₀ ≤ 1, one can choose η > 0 and
δ > 0 such that


Let δ and η be fixed. Then, on placing ε' = ε₁/δ² and applying the above
result only for a fixed σ = σ₀, one obtains the following corollary: If σ₀ > ½
is fixed,⁶ there exists for every pair of positive numbers ε', ε₂ an N = N(ε', ε₂)
with the property that one can choose for every n ≥ N a u_n such that if
T ≥ u_n, then


|ζ_n'/ζ_n(σ₀ + it)| ≤ ε'


whenever t is on a certain subset of the interval 2 ≤ t ≤ T, and this subset
has a measure greater than (1 − ε₂)T. Since ε', ε₂ are independent of each
other, it follows, by letting ε₂ → 0 and writing σ and ε instead of
σ₀ and ε', that, if σ > ½ is fixed, then, for every fixed ε > 0,


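\[ (4) \qquad \lim_{n \to +\infty}\ \limsup_{T \to +\infty}\ \frac{1}{T}\,\operatorname{meas}\{\, |\zeta_n'/\zeta_n(\sigma + it)| > \epsilon \,\} = 0, \]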


where the factor of 1/T denotes the linear measure of the set of those values


⁶ If σ₀ > 1, the statement is trivial.




t for which 


0 < t < T and |ζ_n'/ζ_n(σ + it)| > ε.
Thus (4) means that if σ > ½ is fixed, the sequence


tends in relative measure⁷ to the function of t which is ≡ 0. It is clear from
the above proof that the convergence in relative measure is uniform for all
σ ≥ σ₀ if σ₀ > ½ is fixed. On writing
f_n(t) [→] f(t), n → +∞,
if f_n(t) tends in relative measure to f(t), i.e., if

\[ \lim_{n \to +\infty}\ \limsup_{T \to +\infty}\ \frac{1}{T}\,\operatorname{meas}\{\, |f(t) - f_n(t)| > \epsilon \,\} = 0 \]


holds for every fixed ε > 0, the above result may be formulated as follows:
THEOREM I. If σ > ½ is fixed, then Euler's series

\[ (5) \qquad -\sum_{k=1}^{\infty} \frac{p_k^{-s} \log p_k}{1 - p_k^{-s}}, \qquad (s = \sigma + it), \]


for the function ζ'/ζ(s) converges in relative measure to ζ'/ζ(σ + it), i.e.,
(6) p_n(σ + it) [→] ζ'/ζ(σ + it),


where p_n is an abbreviation for

\[ (7) \qquad p_n(s) = p_n(\sigma + it) = -\sum_{k=1}^{n} \frac{p_k^{-s} \log p_k}{1 - p_k^{-s}}. \]


Furthermore, (4) holds uniformly for σ ≥ σ₀ if σ₀ > ½ is fixed.


In fact, if σ > 1, then (5) may be written as the absolutely convergent
Dirichlet series of ζ'/ζ(s), and so


(8) ζ_n'/ζ_n(s) = ζ'/ζ(s) − p_n(s)


is, for σ > 1, obvious from the definitions (7) and (1) of p_n(s) and ζ_n(s).
Now the function (7) is regular analytic in the half-plane σ > 0, and this
half-plane contains the domain Γ defined above. Since ζ(s) and ζ_n(s) are
regular analytic and distinct from zero in the simply connected domain Γ
which contains the half-plane σ > 1, it follows that (8) holds at every point s
of Γ. Consequently, (6) is equivalent to (4).
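
For σ > 1, where everything converges absolutely, the partial sums of Euler's series are easy to evaluate numerically. The Python sketch below is illustrative only and its function names are hypothetical. It sums p^{-s} log p / (1 − p^{-s}) over the primes up to a cutoff; assuming the sign convention ζ'/ζ(s) = −Σ p^{-s} log p / (1 − p^{-s}), this sum is −p_n(s). It says nothing about the critical strip, where the series diverges and only convergence in relative measure is asserted.

```python
import math

def primes_up_to(limit):
    """Return all primes <= limit using a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Mark every multiple of p from p*p upward as composite.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]

def euler_partial_sum(s, primes):
    """Sum of p**(-s) * log(p) / (1 - p**(-s)) over the given primes (s real or complex)."""
    total = 0j
    for p in primes:
        q = p ** (-s)
        total += q * math.log(p) / (1 - q)
    return total

if __name__ == "__main__":
    primes = primes_up_to(100000)
    print(euler_partial_sum(2, primes))           # real point s = 2
    print(euler_partial_sum(1.5 + 10j, primes))   # a point with sigma > 1 off the real axis
```

At s = 2 the partial sums approach −ζ'(2)/ζ(2), which is roughly 0.57; the terms omitted beyond a cutoff N contribute about 1/N there.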


⁷ Cf. Jessen and Wintner [3], Section 11.




It may be mentioned that the function ζ'/ζ(σ + it) to which the series
(5) is convergent in relative measure is a continuous function of t, if σ > ½
is not the abscissa of a zero of ζ (supposing that there exist such zeros). The
corresponding fact does not hold in Bohr's case of log ζ(s), if the Riemann
hypothesis is false. In fact, if the Riemann hypothesis is false, then each
point of the cuts J_n occurring in the definition of Γ is a discontinuity of the
function log ζ(σ + it) of t, if σ > ½ is fixed and < σ_n, where σ_n is the
abscissa of the right-hand end of J_n.


Remark. Since ζ'/ζ(s) is regular for s = 1 + it ≠ 1, and since the series


n=1 k=2 


co CoO 
{(1— log pn — log Pn} log Pn 
n=1 ‘ 
represents a regular function for σ > ½ and so for s = 1 + it, it is clear from
log pn log pm = O (log m), m—>-+ 
n=1 


that the series 


is, in virtue of a general theorem of M. Riesz,⁸ convergent for every
s = 1 + it ≠ 1. In other words, the series (5) is convergent for every
s = 1 + it ≠ 1. This does not imply, however, the statement of Theorem I
for σ = 1, since (5) is not uniformly convergent for all s = 1 + it,
2 < t < +∞ (if σ < 1, then (5) is clearly divergent for every t). Correspondingly,
the following theorem seems to be new not only for ½ < σ < 1 but


for σ = 1 as well:


THEOREM II. The function $\zeta'/\zeta(\sigma + it)$ possesses, for every fixed $\sigma > \tfrac12$,
an asymptotic distribution function.


This theorem is⁹ a consequence¹⁰ of Theorem I, since the function
$\rho_n(\sigma + it)$ of $t$ is, according to (7), almost periodic in the sense of Bohr.


THEOREM III. The asymptotic distribution function of $\zeta'/\zeta(\sigma + it)$,
where $\sigma > \tfrac12$ is fixed, is absolutely continuous with a density which possesses
partial derivatives of arbitrarily high order. If $\tfrac12 < \sigma < 1$, then the density
is a transcendental entire function of two variables.


⁸ M. Riesz [5], p. 350.
⁹ Cf. Jessen and Wintner [3], Section 11.
¹⁰ The statement of Theorem I as to uniformity with respect to $\sigma$ is not needed.



678 RICHARD KERSHNER AND AUREL WINTNER. 


In fact, since the logarithms of the prime numbers are linearly independent,
Theorem III follows from Theorem I by an obvious modification of methods
previously applied¹¹ to $\log \zeta(\sigma + it)$. The same holds¹¹ also for


THEOREM IV. If $< the density of the asymptotic distribution 
function of the function $\zeta'/\zeta(\sigma + it)$ of $t$ is everywhere positive but vanishes
at infinity as strongly as a Gaussian density.


Needless to say, Theorem IV may be considered as an indication of the 
truth of Riemann’s hypothesis (which, in turn, does not imply Theorem IV). 


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] H. Bohr, "Zur Theorie der Riemannschen Zetafunktion im kritischen Streifen,"
Acta Mathematica, vol. 40 (1915), pp. 67-100.

[2] H. Bohr, "Über diophantische Approximationen und ihre Anwendungen auf
Dirichletsche Reihen, besonders auf die Riemannsche Zetafunktion," Proceedings of
the Fifth Congress of Scandinavian Mathematicians, Helsingfors, 1922, pp.
131-154.

[3] B. Jessen and A. Wintner, "Distribution functions and the Riemann zeta function,"
Transactions of the American Mathematical Society, vol. 38 (1935), pp. 48-88.

[4] E. R. van Kampen and A. Wintner, "Convolutions of distributions on convex curves
and the Riemann zeta function," American Journal of Mathematics, vol. 59
(1937), pp. 175-204.

[5] M. Riesz, "Ein Konvergenzsatz für Dirichletsche Reihen," Acta Mathematica, vol.
40 (1915), pp. 349-361.


¹¹ Jessen and Wintner [3], Sections 8, 13, 14.



ON THE ADDITION OF CONVEX CURVES AND THE DENSITIES 
OF CERTAIN INFINITE CONVOLUTIONS.* 


By E. R. van KAMPEN. 


Let $S_1, S_2, \cdots$ be a sequence of convex curves in a plane such that the
infinite convolution $\phi$ of the distribution functions¹ $\phi_1, \phi_2, \cdots$ corresponding
to $S_1, S_2, \cdots$ is convergent. Under very general conditions it has been proved²
that the distribution function $\phi$ has continuous partial derivatives of arbitrarily
high order. If the curves $S_1, S_2, \cdots$ are analytic, the question arises whether
$\phi$ is regular analytic anywhere in the plane. In certain special cases this
question has already been investigated.³ The object of the present paper is
to develop a concise formalism for the description of the geometrical process
of addition of convex curves. On using this formalism in connection with the
study of $\phi$ a simple proof will be given in Section IV for the analyticity of the
density of $\phi$ in certain specified regions. Although the boundaries of these
regions are defined in such a way as to suggest their having a singular character
for the density of $\phi$, it still remains a problem to decide under fairly general
conditions whether or not it is possible for $\phi$ to be regular on these boundaries.⁴
As an application it will be shown in Section V that there exist for the asymptotic
distribution function of the logarithm of the Riemann zeta function, in
addition to the ring shaped regions of analyticity previously determined,⁵
certain regions of analyticity which may be described as crescent shaped.

In Sections I, II, III, in which the geometric considerations are given, the
convex curves have been replaced by convex hypersurfaces in an n-dimensional
space. This modification has no effect on the proofs but justifies the introduction
of vector notations which lead to a slight simplification even if n = 2.
The content of Section I is in the main a repetition of a treatment of the same
problem by Kershner.⁶ However, apart from the minor differences concerning
the dimension of the containing space and the notation, the complete induction
process used by Kershner has been replaced by a direct passage from the case


* Received March 15, 1937.

¹ Cf. Jessen and Wintner [3], Sections 4 and 7.

² Ibid., Section 8.

³ Van Kampen and Wintner [6].

⁴ Ibid., Appendix.

⁵ Ibid., Section 7.

⁶ Kershner [4].


of two hypersurfaces to the general case, so that a limiting process for the case
of infinitely many hypersurfaces becomes superfluous. The result of Section
III is not used in what follows, but gives an insight into the character of the
sets F and G introduced in Section II.


I. Vectorial sums of convex hypersurfaces. Let $\eta$, $\omega$ be variable vectors
in a real finite-dimensional vector space such that $\omega$ is of length 1 and
let $\omega \cdot \eta$ denote the scalar product of these two vectors. By a convex hypersurface
will be understood the point set theoretic boundary of a non-empty
bounded convex point set in the space. Thus in particular a convex hypersurface
may degenerate into a single point.

Let $S_k$ be a convex hypersurface. If the exterior normal to a supporting
hyperplane of $S_k$ at the point $\eta$ of $S_k$ has the direction determined by $\omega$, the
notation

(1) $\eta = \eta_k(\omega)$

will be used, even though (1) is in general not an allowable parametric representation
of $S_k$. The supporting function $h_k(\omega)$ of $S_k$ appears in the form

(2) $h_k(\omega) = \omega \cdot \eta_k(\omega)$.


The convex hypersurface $S^*_k$ which is symmetrical to $S_k$ with respect to the
origin may be represented by

(3) $S^*_k$: $\eta = -\eta_k(-\omega)$,

and so its supporting function is

(4) $h^*_k(\omega) = h_k(-\omega)$.

The fact that $S_k$ is convex may be expressed by the inequality

(5) $\omega \cdot \eta_k(\omega') \leq h_k(\omega)$,

where the equality sign holds for $\omega = \omega'$. On substituting $-\omega$ in (5) instead
of $\omega$ one obtains

(6) $-h_k(-\omega) \leq \omega \cdot \eta_k(\omega')$,

where the equality sign holds for $\omega = -\omega'$, in view of (2). It follows from
(5) and (6) that

(7) $h_k(\omega) + h_k(-\omega) \geq 0$,

which relation is also evident from the convexity of $S_k$. A point $\eta$ clearly is
contained in the closed convex set determined by $S_k$ if and only if



(8) $\omega \cdot \eta \leq h_k(\omega)$,

for every $\omega$, while such a point $\eta$ is a point $\eta = \eta_k(\omega')$ of $S_k$ if and only if the
equality sign in (8) holds for $\omega = \omega'$.
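
The relations (2), (5), (7) and the membership criterion (8) can be checked numerically on a concrete curve. The sketch below is only an illustration: it assumes an ellipse with semi-axes `A` and `B` (hypothetical values), for which the supporting function is $h(\omega) = \sqrt{A^2 \cos^2 \alpha + B^2 \sin^2 \alpha}$ with $\omega = (\cos \alpha, \sin \alpha)$, and it tests (8) only on a finite sample of directions.

```python
import math

A, B = 3.0, 1.5  # semi-axes of a hypothetical ellipse x^2/A^2 + y^2/B^2 = 1


def unit(angle):
    """Unit vector omega making the given angle with the positive x-axis."""
    return (math.cos(angle), math.sin(angle))


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def h(omega):
    """Supporting function of the ellipse in the direction omega."""
    return math.hypot(A * omega[0], B * omega[1])


def eta(omega):
    """Point of the ellipse whose exterior normal has the direction omega."""
    scale = h(omega)
    return (A * A * omega[0] / scale, B * B * omega[1] / scale)


def inside(point, n=720):
    """Criterion (8), sampled: omega . point <= h(omega) for n directions."""
    return all(
        dot(unit(2 * math.pi * j / n), point) <= h(unit(2 * math.pi * j / n)) + 1e-12
        for j in range(n)
    )
```

Here `eta` plays the role of the representation (1) and `h` that of (2); since the criterion is sampled, `inside` can only certify membership up to the angular resolution chosen.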


Let an infinite sequence of convex hypersurfaces

(9) $S_1, S_2, \cdots$

be given by (1) such that

(10) $|\eta_k(\omega)| < a_k$, $\sum_k a_k < +\infty$,

for certain numbers $a_k$ and all $\omega$. Thus all series occurring in this Section are
absolutely (uniformly) convergent. The particular case where only a finite
number of hypersurfaces are given need not be excluded but will not be referred
to. By a simple argument⁷ it may be seen that the locus $S$ represented by⁸

(11) $\eta = \sum_k \eta_k(\omega)$

is a convex hypersurface and has the supporting function

(12) $h(\omega) = \sum_k h_k(\omega)$.

The locus represented by

(13) $\eta = \sum_k \eta_k(\omega_k)$,

where the $\omega_k$ vary independently, is the vectorial sum of the $S_k$ and will be
denoted by $T$. It follows from (5) and (13) by summation that

(14) $\omega \cdot \eta \leq h(\omega)$,

for every point $\eta$ in $T$, while in view of (2) the equality sign holds in (14)
if $\omega_k = \omega$ for every $k$, in which case $\eta$ is a point in $S$ also. Thus $T$ is contained
in the closed convex set $C$ determined by $S$. On adding (2) for $k = 1$ and
(6) for all $k > 1$, it follows from (13) that

$h_1(\omega) - \sum_{k>1} h_k(-\omega) \leq \omega \cdot \eta$

if $\eta$ is determined by (13) and $\omega = \omega_1$. Thus the set $U$ of points $\eta$ defined by
the inequality

(15) $h_1(\omega) - \sum_{k>1} h_k(-\omega) > \omega \cdot \eta$ for every $\omega$
⁷ Haviland [2].
⁸ This equation should be read in the sense that if for a given $\omega$ and for one or
more values of $k$ the functions $\eta_k(\omega)$ are not univalued, then (11) represents all possible
values taken by the sum on the right.


does not have a point in common with $T$ although $U$ is contained in $C$ as a
consequence of (7). From the form of (15) it is clear that the set $U$ is either
empty or an open convex set. On replacing $\omega$ in (15) by $-\omega$ and adding the
result to (15) one obtains

(16) $h_1(\omega) + h_1(-\omega) > \sum_{k>1} \{h_k(\omega) + h_k(-\omega)\}$,

so that (16) is a necessary (but not sufficient) condition for the set $U$ to be
non-empty. It follows also that $U$ must be empty except for at most one
choice of $S_1$ among the given $S_k$. In order to assure that the correct choice
of $S_1$ has been made, it will be assumed that the $S_k$ have been enumerated in
such a way that

(17) $\max_\omega \{h_k(\omega) + h_k(-\omega)\} \leq \max_\omega \{h_1(\omega) + h_1(-\omega)\}$, $(k = 1, 2, \cdots)$.

A boundary point $\eta_0$ of $U$ clearly satisfies the inequality

(18) $\omega \cdot \eta_0 \leq h_1(\omega) - \sum_{k>1} h_k(-\omega)$,

while the equality sign in (18) holds for at least one value of $\omega$, say for
$\omega = \omega_0$. Put

(19) $\eta'' = \sum_{k>1} \eta_k(-\omega_0)$ and $\eta_0 = \eta' + \eta''$.

On placing in (6) the variable $\omega'$ equal to $-\omega_0$ and adding the result to (18)
for all $k > 1$, one obtains in view of (19)

$\omega \cdot \eta' \leq h_1(\omega)$,

where the equality sign still holds for $\omega = \omega_0$. Thus $\eta' = \eta_1(\omega_0)$, by the
remark made in connection with (8), so that $\eta_0$ is a point of the locus $R'$
represented by

(20) $\eta = \eta_1(\omega) + \sum_{k>1} \eta_k(-\omega)$.

Since $R'$ is contained in $T$ it is clear that the boundary $R$ of $U$ is contained
in $T$. Now it will be shown that $T = C - U$.

Consider first the case where $k = 1, 2$ only and let $\eta_0$ be a point which is
in $C$ but not in $T$. Obviously

(21) $\eta_0 - \eta_2(-\omega_2) \neq \eta_1(\omega_1)$,

for every $\omega_1$, $\omega_2$, so that the vectorial sum $T'$ of the convex hypersurfaces
represented by the point $\eta_0$ and by the convex hypersurface $S^*_2$ does not have
a point in common with $S_1$. Obviously $T'$ is obtained from $S^*_2$ by a translation
along the vector $\eta_0$. It is impossible that $T'$ and $S_1$ are exterior to each
other, since otherwise the inequality (21) would hold for all $\eta_0$ on a line
joining $\eta_0$ with $\infty$, hence no point of this line would be in $T$. This contradicts
the assumption that $\eta_0$ is in $C$, since the boundary of $C$ is in $T$. Also
$S_1$ cannot be interior to $T'$, by (17). Thus $T'$ must be interior to $S_1$. This
implies

(22) $\omega \cdot \eta_0 + h_2(-\omega) < h_1(\omega)$,

so that $\eta_0$ is a point of $U$, and the statement that $T = C - U$ is proved if
only two hypersurfaces are considered.

In the general case,⁹ let again $\eta_0$ be a point in $C$ but not in $T$, so that

(23) $\omega \cdot \eta_0 < h(\omega)$

for all $\omega$, since $\eta_0$ is in $C$ but not on the boundary $S$ of $C$, and

(24) $\eta_0 \neq \sum_k \eta_k(\omega_k)$,

for any sequence $\omega_1, \omega_2, \cdots$. It must be shown that $\eta_0$ satisfies (15). The
inequality (24) may be written in the form

(25) $\eta_0 - \sum'' \eta_k(-\omega_k) \neq \sum' \eta_k(\omega_k)$,

where $\sum'$ denotes a summation over a set of distinct positive integers $k$ and
$\sum''$ denotes the summation over the remaining positive integers $k$. It may
be seen from (25) (by a repetition of the argument which follows (21)) that
either

(26a) $\omega \cdot \eta_0 > \sum' h_k(\omega) - \sum'' h_k(-\omega)$

or

(26b) $\omega \cdot \eta_0 < \sum' h_k(\omega) - \sum'' h_k(-\omega)$

holds for given summations $\sum'$, $\sum''$ and for every $\omega$.

Now suppose, if possible, that $\eta_0$ is not in $U$, so that $\eta_0$ does not satisfy
(15). Then the alternatives (26a), (26b) imply that

(27) $\omega \cdot \eta_0 > h_1(\omega) - \sum_{k>1} h_k(-\omega)$.

On the other hand it follows from (23) and from the absolute-uniform convergence
of the series involved that

(28) $\omega \cdot \eta_0 < \sum_{k \leq p} h_k(\omega) - \sum_{k>p} h_k(-\omega)$

⁹ Only at this point the treatment becomes essentially different from the one given
by Kershner [4].



for a sufficiently large $p$ and for every $\omega$. Thus the alternatives (26a), (26b)
and (7) imply the existence of an integer $l$ such that

(29) $\sum_{k<l} h_k(\omega) - h_l(-\omega) < \omega \cdot \eta_0 + \sum_{k>l} h_k(-\omega) < \sum_{k<l} h_k(\omega) + h_l(\omega)$,

for every $\omega$. On denoting by $S^1$, $S^2$ the convex hypersurfaces represented by

$S^1$: $\eta = \sum_{k<l} \eta_k(\omega)$, $h^1(\omega) = \sum_{k<l} h_k(\omega)$;
$S^2$: $\eta = \eta_l(\omega)$, $h^2(\omega) = h_l(\omega)$,

it is clear from (17) and (7) that the condition corresponding to (17) in the
case of $S^1$, $S^2$ is satisfied, so that the result obtained above in the case of two
hypersurfaces may be applied to $S^1$, $S^2$. Thus, by (29) the convex hypersurface
represented by

$\eta = \eta_0 - \sum_{k>l} \eta_k(-\omega)$

is contained in the vectorial sum of $S^1$ and $S^2$. But this is in contradiction
with the assumption that $\eta_0$ is not in $T$. This completes the proof of the
following


THEOREM 1. Let $S_k$ $(k = 1, 2, \cdots)$, be a sequence of convex hypersurfaces
the representations (1) of which satisfy (10) and the supporting
functions (2) of which satisfy (17). Let $S$ be the convex hypersurface
represented by (11), let $C$ be the closed convex set determined by $S$ and let
$U$ be the (possibly empty) open convex subset of $C$ determined by (15). Then
the vectorial sum $T$ of $S_1, S_2, \cdots$ is $C - U$.
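
The simplest instance of Theorem 1 is a family of concentric circles, where every supporting function is constant. The following sketch is an illustration under that assumption, with hypothetical radii 1, 0.3, 0.2: it samples points (13) of the vectorial sum and checks that they satisfy (14) but never (15), so that $T$ is the annulus between the radii $1 - 0.3 - 0.2$ and $1 + 0.3 + 0.2$. The function names are illustrative, and the inequalities are tested on a finite set of directions only.

```python
import math
import random

RADII = [1.0, 0.3, 0.2]  # hypothetical circles about the origin, largest first (cf. (17))


def directions(point, n=360):
    """Sampled unit vectors, together with the direction of `point` itself."""
    dirs = [(math.cos(2 * math.pi * j / n), math.sin(2 * math.pi * j / n)) for j in range(n)]
    norm = math.hypot(point[0], point[1])
    if norm > 0:
        dirs.append((point[0] / norm, point[1] / norm))
    return dirs


def h_k(k, omega):
    """Supporting function of the k-th circle (constant in omega)."""
    return RADII[k]


def in_C(point, tol=1e-9):
    """(14): omega . eta <= sum_k h_k(omega) for every sampled omega."""
    total = lambda w: sum(h_k(k, w) for k in range(len(RADII)))
    return all(w[0] * point[0] + w[1] * point[1] <= total(w) + tol for w in directions(point))


def in_U(point):
    """(15): omega . eta < h_1(omega) - sum_{k>1} h_k(-omega) for every sampled omega."""
    bound = lambda w: h_k(0, w) - sum(h_k(k, (-w[0], -w[1])) for k in range(1, len(RADII)))
    return all(w[0] * point[0] + w[1] * point[1] < bound(w) for w in directions(point))


def random_point_of_T(rng):
    """A point (13): the sum of one point from each circle, directions independent."""
    x = y = 0.0
    for radius in RADII:
        angle = rng.uniform(0.0, 2 * math.pi)
        x += radius * math.cos(angle)
        y += radius * math.sin(angle)
    return (x, y)


rng = random.Random(0)
SAMPLES = [random_point_of_T(rng) for _ in range(2000)]
```

If the largest radius does not exceed the sum of the others, the bound in `in_U` becomes non-positive for every direction and $U$ is empty, in agreement with the necessary condition (16).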


The supporting function $h_R(\omega)$ of the boundary $R$ of $U$ satisfies the
inequality

$h_R(\omega) \leq h_1(\omega) - \sum_{k>1} h_k(-\omega)$,

in view of the definition (15) of $U$. Moreover, if $\eta_0$ is any point of $R$
and $\omega = \omega_0$ is such that the equality sign holds in (18), then

(30) $h_R(\omega_0) = h_1(\omega_0) - \sum_{k>1} h_k(-\omega_0)$

in view of the expression (2) of the supporting function of a hypersurface.


II. The hypersurfaces $S_E$. Let again (9) be a sequence of convex hypersurfaces
and suppose, for simplicity, that the functions (1) defining (9) are
univalued functions of $\omega$, i. e. that any supporting hyperplane of $S_k$ has exactly
one point in common with $S_k$; and that no degenerate $S_k$ occur, i. e. that in


(7), the equality sign is excluded. In what follows any sum of the form
$\sum \eta_k(\omega)$ will denote the zero-vector if the set of subscripts $k$ over which the
summation is taken is empty. Let $\epsilon_k$ $(k = 1, 2, \cdots)$, be $-1$, $0$ or $+1$ and
let $E$ be a symbol of the form

(31) $E = (\epsilon_1, \epsilon_2, \cdots)$.

For a given symbol (31), let $S_E$ denote the locus represented by

(32) $S_E$: $\eta = \eta_E(\omega) = \sum_{\epsilon_k \neq 0} \eta_k(\epsilon_k \omega)$.

The following remarks concerning these loci are quite obvious:

(i) The symbol obtained from $E$ by replacing every $\epsilon_k$ by $-\epsilon_k$ determines
again the locus $S_E$. Thus all loci $S_E$ are obtained if one normalizes $E$
by the restriction that the first $\epsilon_k$ in $E$ which is not 0 shall be 1.

(ii) The locus (32) which corresponds to the symbol (31) in which
$\epsilon_k = 1$ for every $k$, is the convex hypersurface denoted in Section I by $S$.

(iii) Similarly the locus (32) which corresponds to any symbol $E$ in
which $\epsilon_k$ is either 0 or 1 for every $k$ is a convex hypersurface with the supporting
function

(33) $h_E(\omega) = \sum_{\epsilon_k = 1} h_k(\omega)$.

(iv) If $E$ is any symbol (31), let $S_+$, $S_-$ be the convex hypersurfaces
represented by

(34) $S_+$: $\eta = \eta_+(\omega) = \sum_{\epsilon_k = 1} \eta_k(\omega)$; $S_-$: $\eta = \eta_-(\omega) = \sum_{\epsilon_k = -1} \eta_k(\omega)$.

Then $S_E$ may be represented in the form

(35) $S_E$: $\eta = \eta_E(\omega) = \eta_+(\omega) + \eta_-(-\omega)$.

(v) The set of all symbols (31) may be made into a topological space $\mathfrak{E}$
as follows: If an integer $n$, and $n$ numbers $\epsilon_1^0, \cdots, \epsilon_n^0$ equal to $-1$, $0$, or
$+1$ are given, the subset of $\mathfrak{E}$ formed by those symbols $E$ for which $\epsilon_i = \epsilon_i^0$,
$(i = 1, \cdots, n)$, is said to be open in $\mathfrak{E}$. Also any sum of open sets in $\mathfrak{E}$ is
said to be open in $\mathfrak{E}$. It is well known that the resulting topological space is
homeomorphic with a Cantor set (except, of course, if the number of given
convex hypersurfaces is finite). From the assumption (10) and the definition
(32), it follows that $\eta_E(\omega)$ is a continuous function of the pair of variables $E$, $\omega$.


(vi) A symbol (31) is said to be of infinite length if $\epsilon_k \neq 0$ for every $k$.
The subset of the topological space $\mathfrak{E}$ consisting of all symbols of infinite
length clearly is compact and the $S_E$ corresponding to these symbols are
subsets of the vectorial sum $T$ of all $S_k$. From the last remark in (v) it
follows that the point set $F$ formed by the points of all these $S_E$ is a closed
subset of $T$. The set $F$, which contains the boundary of $T$ by Section I, will
be called the irregular set of $T$. The set $G = T - F$ is an open set and will
be called the regular set of $T$. It may be seen from examples that $G$ may be
empty and that $G$ may be dense in $T$.


(vii) If a symbol of infinite length is such that «——141 for at least 
one k, then Sz does not have a point in common with the exterior boundary 
of T. In fact for such an #, from (32) and (2), since the equality sign in 
(7) is excluded, 


so that the statement follows from the remark made in connection with (8). 


(viii) A symbol (31) is said to be of the finite length $l$ if $\epsilon_k \neq 0$ or $\epsilon_k = 0$
according as $k \leq l$ or $k > l$. Clearly if $E$ is of length $l$, then $S_E$ is contained
in the vectorial sum $T_l$ of $S_1, \cdots, S_l$. The irregular set $F_l$ of $T_l$ is formed
by the $2^{l-1}$ curves $S_E$ corresponding to symbols (31) of length $l$.
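
The count $2^{l-1}$ in (viii) can be made concrete by enumerating the normalized symbols of length $l$. The sketch below is an illustration only: it assumes that the locus $S_E$ is represented by $\sum_k \eta_k(\epsilon_k \omega)$ over the non-zero $\epsilon_k$, and it uses hypothetical concentric circles $\eta_k(\omega) = R_k \omega$, for which every $S_E$ is again a circle about the origin.

```python
from itertools import product

RADII = [1.0, 0.3, 0.2]  # hypothetical circles S_1, S_2, S_3 about the origin


def normalized_symbols(length):
    """Symbols (eps_1, ..., eps_l) with entries +1 or -1, normalized so that eps_1 = +1."""
    return [(1,) + rest for rest in product((1, -1), repeat=length - 1)]


def radius_of_S_E(symbol):
    """For circles, sum_k eta_k(eps_k * omega) = (sum_k eps_k R_k) * omega."""
    return abs(sum(eps * radius for eps, radius in zip(symbol, RADII)))


SYMBOLS = normalized_symbols(len(RADII))
IRREGULAR_RADII = sorted(radius_of_S_E(symbol) for symbol in SYMBOLS)
```

With these radii the irregular set consists of four circles, the innermost and outermost of which are the interior and exterior boundaries of the annulus $T_3$; the regular set is what remains of the annulus after the four circles are removed.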


In what follows use will be made of the following assumption, which will 
be referred to as condition (*) : 


(*) The hypersurface $S_E$ which belongs to the symbol

(36) $E = (+1, -1, -1, -1, \cdots)$

is convex and forms the interior boundary $R$ of the vectorial sum $T$ of the $S_k$.


Concerning this condition the following remarks hold: 

(ix) If condition (*) is satisfied, then the parameter representation
$\eta_R(\omega)$ and the supporting function $h_R(\omega)$ of $R$ are given, in view of (30), by

(37) $\eta_R(\omega) = \eta_1(\omega) + \sum_{k>1} \eta_k(-\omega)$; $h_R(\omega) = h_1(\omega) - \sum_{k>1} h_k(-\omega)$.


(x) If condition (*) is satisfied and $E$ is any symbol (31) for which
$\epsilon_1 = 1$, then $S_E$ is a convex hypersurface with the supporting function

(38) $h_E(\omega) = h_R(\omega) + \sum_{k>1,\ \epsilon_k = 1} \{h_k(\omega) + h_k(-\omega)\} + \sum_{\epsilon_k = 0} h_k(-\omega)$.

In fact, by (32) and (37), the parameter representation of $S_E$ may be brought
into the form

(39) $\eta_E(\omega) = \eta_R(\omega) + \sum_{k>1,\ \epsilon_k = 1} \{\eta_k(\omega) - \eta_k(-\omega)\} - \sum_{\epsilon_k = 0} \eta_k(-\omega)$.

Since the separate terms in (39) are representations of convex hypersurfaces
by (3), it follows, as in connection with (11) and (12), that (39) represents
a convex hypersurface and that the supporting function of (39) is (38).


(xi) The following particular case of (x) is worth noting: If $E$ is a
symbol (31) of length $l$, then $S_E$ is a convex hypersurface, moreover the hyperplane
of normal direction $\omega$ through the point (32) of $S_E$ is a supporting
plane of $S_E$.

(xii) If the condition in (xi) holds for every $l$, then the $S_E$ corresponding
to any symbol $E$ of infinite length is a convex hypersurface, even if condition
(*) is not satisfied. This follows easily from the last statement under (v).


(xiii) If (*) is satisfied and a symbol (31) is such that $\epsilon_k \neq 0$ for every
$k$ and $\epsilon_k = 1$ for at least one $k > 1$, then the open convex region determined
by the corresponding $S_E$ contains the interior boundary of $T$. In fact if $h_E(\omega)$
is the supporting function of $S_E$, then by (39)

$h_E(\omega) - h_R(\omega) = \sum_{k>1,\ \epsilon_k = 1} \{h_k(\omega) + h_k(-\omega)\} > 0$,

since the equality sign is excluded in (7), so that the statement follows from
the remark in connection with (8).


III. A further consideration of the sets F and G [II (vi)]. In what
follows use will be made of the following remark. If $\eta = \eta_1(\omega)$ is the representation
(1) of a convex hypersurface $S_1$ and $H$ represents a sphere of radius
$r < \rho$ and centre at the origin together with its interior, then the vectorial sum
of $S_1$ and $H$ does not contain the interior of any hypersphere of radius $\rho$ which
does not meet $S_1$. This is clear since the distance of any point of this vectorial
sum to the nearest point of $S_1$ is less than $\rho$.

Now suppose that (9) is a sequence of convex hypersurfaces satisfying
(10), (17) and the conditions formulated at the beginning of Section II.
Suppose in addition that the origin of the vector space is in the interior of each
of the hypersurfaces $S_k$, i. e., that

(40) $\omega \cdot \eta_k(\omega) = h_k(\omega) > 0$

for every $k$ and every $\omega$, and also that the $S_E$ corresponding to (9) have
property II, (xi) for every $l$. Let $V$ be a given component of the regular
set $G$ [II, (vi)] of the vectorial sum $T$ of the $S_k$. Now there exists an integer
$m$, which depends on $V$ and has the following properties:


(i) if $E$ is a symbol (31) such that the corresponding $S_E$ contains a
point of the boundary of $V$, then the $\epsilon_k$ in $E$ for which $k > m$ are all equal.



(ii) for every $n > m$, the regular set $G_n$ of the vectorial sum $T_n$ of
$S_1, \cdots, S_n$ has a component $V_n$ which contains $V$. Moreover $V_n$ contains
$V_{n+1}$ for every $n > m$.


It will be shown first that if $\rho > 0$ is such that $V$ contains the interior
of a hypersphere of radius $\rho$, and $m$ is such that

(41) $a_k < \tfrac12 \rho$ for every $k > m$,

where $a_k$ is defined in (10), then $m$ satisfies (i). In fact let $E$ be a symbol
(31) such that $\epsilon_p = -1$, $\epsilon_q = 1$, where $p$ and $q$ are fixed numbers larger
than $m$, then the loci $S_+$, $S_E$, $S_-$ determined by


= 1,10) + —m(—o), 7 + m(—») 
are convex hypersurfaces contained in $F$ [II, (vi)]. Now it is clear from (7)
and (41) that the remark at the beginning of this Section may be applied.
This shows that a region which contains the interior of a hypersphere of
radius $\rho$ and which does not meet $S_E$ but has a boundary point in $S_E$, must
necessarily meet either $S_+$ or $S_-$. Thus $V$ cannot have a boundary point on $S_E$
and so an integer $m$ which satisfies (41) has property (i).

Next it will be shown that an integer $m$ has not only property (i) but also
property (ii) if the numbers $a_k$ occurring in (10) satisfy not only (41) but also

(42) $\sum_{k>m} a_k < \tfrac12 \rho$,

where $\rho$ is again the radius of a sphere, the interior of which is in $V$. Let $E$
be any symbol (31) of length $l > m$ and let $E_+$, $E_-$ be the symbols obtained
from $E$ by replacing all $\epsilon_k$ for which $k > l$ (which are 0 by II, (viii)) by
$+1$, $-1$ respectively. If $\eta_+(\omega)$, $\eta_-(\omega)$ are the parameter representations (32)
of the hypersurfaces $S_+$, $S_-$ which correspond to $E_+$, $E_-$, then it follows from
(40), (42) and (10) that $\eta_+(\omega) - \eta_-(\omega)$ (which is by (32), (3) and II, (iii)
the representation (1) of some convex hypersurface) satisfies


(n.(w) —7_-(w)) Zp. 


It follows from the remark at the beginning of this Section that if the interior
of some sphere of radius $\rho$ does not meet $S_+$ or $S_-$, then it does not meet $S_E$.
Thus if $E$ is of length $l > m$, then $S_E$ does not meet $V$ and accordingly the
region $V_l$ as defined in (ii) must exist if $l > m$. Obviously $V_l \supset V_{l+1}$, $l > m$,
follows by the same argument, since no essential use has been made of the
fact that the number of hypersurfaces in (9) is infinite.

Thus a number $m$ exists which has both property (i) and property (ii).
Of course the results of this Section hold if condition (*) is satisfied.



IV. The density of convolutions of distribution functions corresponding
to convex curves. From now on convex curves $S_k$ in the $(x, y)$-plane will
be considered. Let the convex curve $S_k$ be given by

(43) $S_k$: $x = x_k(\theta)$, $y = y_k(\theta)$

where $\theta$ is an angular parameter and where exactly one point of $S_k$ corresponds
to each value of $\theta$. It is well known that (43) determines a distribution
function $\phi_k(E)$, where $E$ is any Borel set in the $(x, y)$-plane, by the rule that
$\phi_k(E)$ is the $\theta$-measure of the set of values of $\theta$ for which the point (43) of
$S_k$ is in $E$. Obviously $S_k$ is the spectrum of $\phi_k$. The convolution

(44) $\psi_n = \phi_1 * \phi_2 * \cdots * \phi_n$, $n > 1$

has as its spectrum the vectorial sum $T_n$ of $S_1, \cdots, S_n$. On the assumption
that (43) has a continuous derivative and that
(45) + for every 6, 


and finally that no $S_k$ contains a line segment, it has been proved¹⁰ that the
convolution of at least two $\phi_k$ is absolutely continuous and that the convolution
of at least four $\phi_k$ has a continuous density. If it is also assumed that the
functions (43) are regular analytic, then it has been shown¹⁰ that the density
of the convolution of two $\phi_k$ is a regular analytic function of $x$ and $y$ on the
regular set of the vectorial sum of the corresponding $S_k$ (cf. (vi) in Section II)
and that this density is not bounded in the vicinity of any point of the irregular
set of the vectorial sum. This statement forms the special case where $n = 2$
of the following


THEOREM 2$_n$. Let the convex curves $S_k$ $(k = 1, 2, \cdots)$, in the $(x, y)$-plane
be given by the regular analytic parameter representation (43) satisfying
(45) and let the corresponding $S_E$ have property (xi) of Section II for $l \leq n$.
Then the density $\delta_n(x, y)$ of (44) is regular analytic on the regular set $G_n$
of the vectorial sum $T_n$ of $S_1, \cdots, S_n$.


It will be assumed that the following statement has been proved:

($\S_n$) On the assumptions of Theorem 2$_n$, not only the statement of the
Theorem holds but also: If $P$ is a common point of exactly $p$ distinct curves $S_E$,
where $E$ is of length $n$, and $S^i$ $(i = 1, \cdots, p)$, denotes these curves, then a
vicinity $U$ of $P$ may be found, and a regular analytic function $A_i(x, y)$ in the
complement of $S^i$ in $U$ $(i = 1, \cdots, p)$, such that for any point $(x, y)$ in $U$,

(46) $\delta_n(x, y) = \sum_{i=1}^{p} A_i(x, y)$.


¹⁰ Van Kampen and Wintner [6], p. 103.


Needless to say $A_i(x, y)$ consists in general of distinct regular functions
in the parts into which $U$ is divided by $S^i$. Thus (46) is to the effect that
singularities of $\delta_n(x, y)$ at the several curves $S_E$ are added at common points
of distinct curves $S_E$. In case $n = 2$, Theorem 2$_n$ and ($\S_n$) are correct, the
latter by (vii) of Section II. It will be shown that ($\S_n$) implies ($\S_{n+1}$), thus
completing the proof by complete induction of Theorem 2$_n$.

If the conditions of Theorem 2$_{n+1}$ are satisfied, then clearly $F_n$ consists
of $2^{n-1}$ convex analytic curves $S_E$, so that any singularity of $F_n$ is of the type
described in ($\S_n$).

Since $\psi_{n+1} = \psi_n * \phi_{n+1}$ one has for the density $\delta_{n+1}(x, y)$ of $\psi_{n+1}$

$\delta_{n+1}(x, y) = \iint \delta_n(x - \xi, y - \eta)\, d\phi_{n+1}(\xi, \eta)$

over the $(\xi, \eta)$-plane, or, by the definition of $\phi_{n+1}(E)$,

(47) $\delta_{n+1}(x, y) = \int \delta_n(x - x_{n+1}(\theta), y - y_{n+1}(\theta))\, d\theta$

over all angles $\theta$. For a fixed $(x_0, y_0)$ let $P_1, \cdots, P_q$ be the distinct common
points of $F_n$ and the curve

(48) $x = x_0 - x_{n+1}(\theta)$, $y = y_0 - y_{n+1}(\theta)$

and let $J_1, \cdots, J_q$ be non-overlapping $\theta$-intervals containing $P_1, \cdots, P_q$ such
that the arcs of (48) corresponding to these intervals are contained in the
vicinities corresponding to $P_1, \cdots, P_q$ by ($\S_n$). Clearly the contribution to
(47) of the $\theta$-intervals, obtained by omitting $J_1, \cdots, J_q$ from the range of $\theta$,
is regular in a vicinity of $(x_0, y_0)$. Suppose that (48) is not tangent to the
curve $S^i$ of $F_n$ which passes through $P = P_j$. Then the contribution

(49) $\int_{J_j} \delta_n(x - x_{n+1}(\theta), y - y_{n+1}(\theta))\, d\theta$

to (47) obviously is regular in a vicinity of $(x_0, y_0)$ in view of ($\S_n$). Now
suppose that (48) is tangent to the curve $S^i$ of $F_n$ at $P = P_j$. Then clearly
(49) is regular in a vicinity of $(x_0, y_0)$ except at those points $(x, y)$ of this
vicinity for which the curve corresponding to (48) still is tangent to $S^i$.
Thus ($\S_{n+1}$) will follow if it is shown that to a point of tangency $P_j$ of (48)
and $S^i$ there corresponds a point of one of the curves of $F_{n+1}$ at $(x_0, y_0)$. Now,
if $\omega'$ is the value of $\omega$ corresponding to the point $P_j$ of $S^i$, then the corresponding
point of $S_{n+1}$ belongs to $\omega'$ or to $-\omega'$, so that the statement follows
from the definition (32) of the curves $S_E$ constituting $F_{n+1}$.


Remark. In Theorem 2$_n$ it may be allowed that one or more of the curves
$S_E$ occurring degenerates into a single point. In that case "a curve is tangent
to $S_E$" should be read "the curve passes through the point into which $S_E$
degenerates."

Now let the convex curves $S_k$ given in the $(x, y)$-plane be infinite in number
and let (10) be satisfied. Then the infinite convolution

(50) $\psi = \lim_{n \to \infty} \psi_n = \phi_1 * \phi_2 * \cdots$

is convergent and has as its spectrum the vectorial sum $T$ of the $S_k$, while

(51) $\delta(x, y) = \lim_{n \to \infty} \delta_n(x, y)$,

where $\delta(x, y)$ is the density of (50).

THEOREM 3. Let the curves $S_k$ be given by the regular analytic representations
(43) satisfying (45) and let the corresponding $S_E$ have property
(xi) of Section II for every $l$. Then the density (51) of (50) is regular
analytic on the regular set $G$ of the vectorial sum $T$ of the $S_k$.


In fact, let $(x_0, y_0)$ be any point of $G$, let $3\rho > 0$ be less than the distance
from $(x_0, y_0)$ to the irregular set in $T$ and let $N$ be an integer such that

$\sum_{n=N+1}^{\infty} a_n < \rho$,

where the $a_n$ are the numbers occurring in (10). Then clearly the circular
disk with radius $2\rho$ and centre at $(x_0, y_0)$ belongs to the regular set $G_N$ of $T_N$
and the spectrum of

(52) $\psi^N = \phi_{N+1} * \phi_{N+2} * \cdots$

is within a distance $\rho$ from the origin. If $\delta^N(x, y)$ is the density of $\psi^N$, then

(53) $\delta(x, y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \delta_N(x - \xi, y - \eta)\, \delta^N(\xi, \eta)\, d\xi\, d\eta$

since $\psi = \psi_N * \psi^N$. This integral need only be taken over the spectrum of $\psi^N$,
since $\delta^N(\xi, \eta) = 0$ unless $(\xi, \eta)$ is in this spectrum. On the other hand if
$(x, y)$ is within a distance $\rho$ of $(x_0, y_0)$ and $(\xi, \eta)$ is in the spectrum of $\psi^N$,
then $(x - \xi, y - \eta)$ is within a circle of radius $2\rho$ and centre at $(x_0, y_0)$.


¹¹ Ibid., pp. 184-186.



Thus if $(x, y)$ is in a circle of radius $\rho$ and centre at $(x_0, y_0)$, then the integrand
of (53) is a regular analytic function of $(x, y)$, so that $\delta(x, y)$ is a
regular analytic function of $(x, y)$. This completes the proof of Theorem 3.

Clearly in Theorem 3 the condition that II, (xi) is satisfied for every $l$
may be replaced by the condition that the $S_k$ satisfy the condition (*) introduced
in II.


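V. The logarithm of the Riemann zeta function. If $p_m$ denotes the
$m$-th prime number it is known¹² that for a fixed $\sigma > 1$, the asymptotic distribution
function of the almost periodic function

(54) $f_\sigma(t) = -\log \zeta(\sigma + it) + \tfrac12 \log \zeta(2\sigma) = \sum_{m=1}^{\infty} [\log (1 - p_m^{-\sigma - it}) - \tfrac12 \log (1 - p_m^{-2\sigma})]$

may be represented as the infinite convolution

$\phi^\sigma = \phi_1^\sigma * \phi_2^\sigma * \cdots$,

where $\phi_m^\sigma$ denotes the distribution function which belongs to the convex curve
(43) defined by

(55) $x + iy = \log (1 - p_m^{-\sigma} e^{i\theta}) - \tfrac12 \log (1 - p_m^{-2\sigma})$

or in other words by

(56) $x = x_m^\sigma(\theta) = x(\theta, p_m^{-\sigma})$, $y = y_m^\sigma(\theta) = y(\theta, p_m^{-\sigma})$,

where

(57) $x = x(\theta, r) = \tfrac12 \log \dfrac{1 - 2r \cos \theta + r^2}{1 - r^2}$, $y = y(\theta, r) = -\operatorname{arc\,tan} \dfrac{r \sin \theta}{1 - r \cos \theta}$

is the equation of a system of curves $S^r$ depending continuously on a parameter
$r$, it being understood that the arc tan takes only values between $-\tfrac12\pi$ and
$+\tfrac12\pi$. For $0 < r < 1$, $S^r$ represents a convex analytic curve satisfying (45)
and having both axes $x = 0$, $y = 0$ as lines of symmetry.

Let $h(r, \omega)$ denote the supporting function of $S^r$, where now $\omega$ denotes
not as in I, II, III, a variable unit vector in the $(x, y)$-plane, but the angle
between such a vector and the positive $x$-axis. For reasons of symmetry it is
sufficient to consider only the interval $0 \leq \omega \leq \tfrac12\pi$. Clearly

(58) $h(r, 0) = \tfrac12 \log [(1 + r)/(1 - r)]$, $h(r, \tfrac12\pi) = \operatorname{arc\,sin} r$.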

¹² Van Kampen and Wintner [6], Section 7, where further references are given.



Now let $r_1, r_2, \cdots$ be a sequence of numbers such that $r_1 < 1$ and
$0 < r_{n+1} < r_n$ and that $\sum h(r_n, \omega)$ converges uniformly in $\omega$. It will be shown
that the function

(59) $f(\omega) = h(r_1, \omega) - \sum_{n \geq 2} h(r_n, \omega)$

has the following properties:

(60 a) $f(\omega) > 0$ for every $\omega$, in case $h(r_1, \tfrac12\pi) > \sum_{n \geq 2} h(r_n, \tfrac12\pi)$,

(60 b) $f(\omega) < 0$ for every $\omega$, in case $h(r_1, 0) < \sum_{n \geq 2} h(r_n, 0)$,

(60 c) $f(\omega_0) = 0$ for an $\omega = \omega_0$, $f(\omega) > 0$ if $0 \leq \omega < \omega_0$, $f(\omega) < 0$ if $\omega_0 < \omega \leq \tfrac12\pi$,
in case $h(r_1, \tfrac12\pi) \leq \sum_{n \geq 2} h(r_n, \tfrac12\pi)$ and $h(r_1, 0) \geq \sum_{n \geq 2} h(r_n, 0)$.

Clearly (60 a), (60 b), (60 c) follow from the following statement¹³

(61) $\dfrac{h(r_1, \omega_1)}{h(r_2, \omega_1)} < \dfrac{h(r_1, \omega_2)}{h(r_2, \omega_2)}$ if $0 < r_1 < r_2 < 1$, $0 \leq \omega_1 < \omega_2 \leq \tfrac12\pi$.

For the proof of (61) use will be made of the following properties of the
curves $S^r$:

Let $\bar{S}^r$ denote the curve similar to $S^r$ with respect to the origin in ratio
$1/(\operatorname{arc\,sin} r)$, so that, by (58), the curves $\bar{S}^r$ have the points $(0, \pm 1)$ in
common for $0 < r < 1$. Then $\bar{S}^{r_1}$ is, except for those two points, in the
interior of $\bar{S}^{r_2}$, whenever $0 < r_1 < r_2 < 1$. Moreover, if $\alpha(r, s)$ denotes, for
$0 < s < 1$, the angle between the line $y = s$ and the normal of $\bar{S}^r$ at the
intersection of this line and $\bar{S}^r$, then $\alpha(r, s)$ is an increasing function of $r$
in $0 < r < 1$.

In proving (61) it is clearly admissible to replace $h(r, \omega)$ by the supporting
function of any convex curve similar to $S^r$ with respect to the
origin. On choosing, for fixed values $r_1$, $r_2$ $(r_1 < r_2)$ and $\omega_1$, the curves in
such a way that $h(r_1, \omega_1) = h(r_2, \omega_1)$, it is clear from the geometrical properties
of $\bar{S}^r$ mentioned above that $h(r_2, \omega) < h(r_1, \omega)$ if $\omega > \omega_1$ is sufficiently
near to $\omega_1$. Thus $h(r_1, \omega)/h(r_2, \omega)$ is an increasing function of $\omega$ if
$0 < r_1 < r_2 < 1$ and (61) is proved.
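
The end-point values (58) and the monotonicity asserted in (61) can be examined numerically. The sketch below is an illustration under stated assumptions: the curve $S^r$ is taken as $x = \tfrac12 \log[(1 - 2r\cos\theta + r^2)/(1 - r^2)]$, $y = \operatorname{arc\,tan}[r \sin\theta/(1 - r\cos\theta)]$ (the orientation of $\theta$ does not affect the curve, which is symmetric in both axes), the supporting function is approximated by a maximum over a fine sample of the curve, and the values `R1`, `R2` are arbitrary.

```python
import math


def curve_points(r, n=20000):
    """Sample of the curve S^r at n equally spaced values of theta."""
    points = []
    for j in range(n):
        theta = 2 * math.pi * j / n
        x = 0.5 * math.log((1 - 2 * r * math.cos(theta) + r * r) / (1 - r * r))
        y = math.atan(r * math.sin(theta) / (1 - r * math.cos(theta)))
        points.append((x, y))
    return points


def support(points, omega):
    """Approximate supporting function: max of x cos(omega) + y sin(omega) over the sample."""
    c, s = math.cos(omega), math.sin(omega)
    return max(c * x + s * y for x, y in points)


R1, R2 = 0.2, 0.5  # arbitrary parameters with 0 < R1 < R2 < 1
P1, P2 = curve_points(R1), curve_points(R2)
OMEGAS = [0.5 * math.pi * i / 10 for i in range(11)]
RATIOS = [support(P1, w) / support(P2, w) for w in OMEGAS]  # h(R1, w) / h(R2, w)
```

According to (61) the list `RATIOS` should be increasing in $\omega$; the sampled maximum only approximates the supporting function, so small deviations of the order of the sampling error are to be expected.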

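Let $\sigma^k$, $\sigma_k$ denote, for any positive integer $k$, the obviously unique numbers
such that

(62 a) $\operatorname{arc\,sin} p_k^{-\sigma} = \sum_{m>k} \operatorname{arc\,sin} p_m^{-\sigma}$ if $\sigma = \sigma^k > 1$,

(62 b) $\log [(1 + p_k^{-\sigma})/(1 - p_k^{-\sigma})] = \sum_{m>k} \log [(1 + p_m^{-\sigma})/(1 - p_m^{-\sigma})]$ if $\sigma = \sigma_k > 1$,


¹³ Bohr and Jessen [1]. In this paper (61) is proved for the case $\omega_2 = \tfrac12\pi$. The
general proof given below is an extension of the proof given there.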




694 E. R. VAN KAMPEN. 


and let & denote the number such that 


(62 c) pm’, 
m>1 
Then it is clear from (60 a), (60 b), (60 c) that o* > ox for every k > 0, while 
the numerical values ot = 1.764... and ¢ = 1.778... show that o, < o' <@. 
Clearly for sufficiently large k, o* > o% > G. 
Consider now the case ** where for some integer 1, 


(63) o > Max -, 0). 


Let H; and Ey; be the symbols (31) obtained from a given symbol of length 
1 — 1 by replacing e: by + 1, « by > 1, and by —1,e&by +1,k >], 
respectively. Let Sz, Sz be the corresponding curves (32), 4 (), 7,,(w) their 
parametric representations and h;(w), hi(w) their supporting functions. It 
is known ?* that, whenever o = G, the vectorial sum of the S»*% satisfies con- 
dition (*), so that, by II (x), the curves Sg corresponding to the Sn are 
convex curves and Theorem 3 is applicable. Now the functions hy, hz; satisfy, 
by (32) and the definition of h(p,), the equality 


(64) —hur(w) —= 2h (prt, 0) —2 pw, 0). 


Thus it follows from (60a), (58) and the definition (62a) of o* that 
hi(w) —hy(w) > 0. Hence 8; and Sy are exterior and interior boundary 
of a ringshaped subset of the vectorial sum of all S,,°. In a similar way it may 
be shown that in view! of (63) no Sz corresponding to a symbol F of infinite 
length does enter the ring shaped region determined by S; and Sz. Thus this 
ring shaped region is by Theorem 3 a region of regular analyticity for the 
density of 

A region of regular analyticity as found above does not disappear abruptly 
if o does not satisfy (63) but is very near to values satisfying (63). In fact, 
since all curves Sz depend on o in a uniformly continuous way, if a is restricted 
to a bounded interval with positive endpoints, it is clear that the ring shaped 
region described above remains ring shaped for certain o if ¢ > Max (a,<') 
remains satisfied, but « becomes less than o* for some k —1,-- -,]—1.” 
It will be shown below that the ring shaped region constructed above is replaced 
by a pair of crescent shaped regions if oa satisfies 


14 This case has already been considered by van Kampen and Wintner [6], Section 7. 

15 Bohr and Jessen [1]; Kershner [5]. 

16 It seems to require elaborate numerical calculations to decide whether or not the
sequences {7}, {%} are monotone. 




(65) «> Max (, +, 0°", 7) 

in fact, if (65) is satisfied, then the difference (64) of hr(w) and hyz(w) is a 
function satisfying (60 c), so that the curves S; and Sz; determine four crescent 
shaped regions, two of which are halved by the x-axis and two of which are
halved by the y-axis. It is clear from (60c) that the first two regions have 
§; as exterior and Sy; as interior boundary, while the opposite is true for the 
last two regions. Since S; is in the interior of the curves Sz corresponding to 
symbols # of infinite length adjacent to H;, and Sy, is exterior to the curves 
Sp corresponding to symbols # of infinite length adjacent ** to Hy, it is clear 
that the last two regions are not contained in the regular set of the vectorial 
sum of all S»%. On the other hand, it may be inferred easily from (65) that 
no Sz corresponding to symbols # of infinite length has a point in common 
with the two crescent shaped regions which are halved by the x-axis. Thus
the latter two regions are regions of regular analyticity of the density of $7 
by Theorem 3. It is clear that the two regions just described decrease in size 
as o decreases from a! to o; and disappear completely when o becomes equal 
toa;. The two regions may, of course, cease to be regions of regular analyticity 
of the density of ¢% for some o > a if for instance o'! > a1, a case which 
probably occurs for sufficiently large 1. 


THE JOHNS HOPKINS UNIVERSITY. 


REFERENCES. 


[1] H. Bohr and B. Jessen, “On the distribution of the values of the Riemann zeta
function,” American Journal of Mathematics, vol. 58 (1936), pp. 35-44.

[2] E. K. Haviland, “On the addition of convex curves in Bohr’s theory of Dirichlet
series,” ibid., vol. 55 (1933), pp. 332-334.

[3] B. Jessen and A. Wintner, “Distribution functions and the Riemann zeta function,”
Transactions of the American Mathematical Society, vol. 38 (1935), pp. 48-88.

[4] R. B. Kershner, “On the addition of convex curves,” American Journal of
Mathematics, vol. 58 (1936), pp. 737-746.

[5] R. B. Kershner, “On the values of the Riemann zeta function on fixed lines
σ > 1,” ibid., vol. 59 (1937), pp. 171-178.

[6] E. R. van Kampen and A. Wintner, “Convolutions of distribution functions
on convex curves and the Riemann zeta function,” ibid., vol. 59 (1937), pp. 175-204.


17 For the meaning of words like “adjacent” as applied to symbols (31) compare II, (v).




ON THE PARTIAL SUMS OF CERTAIN FOURIER SERIES.* 


By OTTO SZÁSZ.


1. Suppose $f(x)$ is real-valued, periodic of period $2\pi$, and Lebesgue
integrable over $(-\pi, \pi)$. Denote its Fourier series by

(1) $f(x) \sim a_0/2 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x)$,

and let its partial sums be

$s_0 = a_0/2$, $s_n = s_n(x) = a_0/2 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)$, $(n = 1, 2, \dots)$.


In 1932, Paley [3] proved the two theorems which follow: 


I. Suppose 
(2) $|f(x)| \le M$ in $(-\pi, \pi)$,
(3) = 0, = 0, 0, (n == 1,2,---), 
then 
(4) $|s_n(x)| \le 10M$, $(n = 0, 1, 2, \dots)$.


II. If in addition f(x) is continuous, then its Fourier series (1) converges
to f(x) uniformly for all x.


For Theorem I Fejér [1] gave a simpler proof and replaced the constant 
10 by 4, so that 
(5) $|s_n(x)| \le 4M$.


I give in (§ 2) another elementary proof for both theorems simultaneously.
At the same time I replace (5) by (§§ 3-4)

(6) $|s_n(x)| \le M(2 + \alpha/\sin^2\alpha) < 3.38M$,

where $\alpha$ is the unique root of the equation $2\alpha = \tan\alpha$ in the interval
$0 < \alpha < \pi/2$.
If B is the least number such that (2) and (3) imply 


$|s_n(x)| \le BM$, $(n = 0, 1, 2, \dots)$,


* Received April 26, 1937. 


then, by (6), 
$B \le 2 + \alpha/\sin^2\alpha < 3.38$.


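On the other hand, if $f(x) = (\pi - x)/2$ for $0 < x < \pi$ and $f(-x) = -f(x)$, then

$f(x) \sim \sum_{\nu=1}^{\infty} \dfrac{\sin \nu x}{\nu}$, $M = \pi/2$, and $s_n\!\left(\dfrac{\pi}{n+1}\right) \to \int_0^{\pi} \dfrac{\sin x}{x}\,dx = 1.851\ldots$.

Thus $B$ lies somewhere between 1.17 and 3.38.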
In §§ 5-6, I consider generalized trigonometric series, improving on a 
theorem by M. Fekete [2]. 
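The lower bound 1.17 can be checked numerically. The following Python sketch is an illustration only and not part of the original argument. It assumes that the extremal example is the series $\sum \sin \nu x/\nu$ with $M = \pi/2$; the helper names `partial_sum` and `sine_integral_pi` are hypothetical and introduced here.

```python
import math

def partial_sum(n, x):
    """s_n(x) for the assumed example series sum_{v>=1} sin(v x) / v."""
    return sum(math.sin(v * x) / v for v in range(1, n + 1))

def sine_integral_pi(steps=20000):
    """Composite Simpson approximation of the integral of sin(x)/x over [0, pi]."""
    h = math.pi / steps  # steps must be even for Simpson's rule

    def g(x):
        # sin(x)/x extended continuously by 1 at x = 0
        return 1.0 if x == 0.0 else math.sin(x) / x

    total = g(0.0) + g(math.pi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

M = math.pi / 2                      # sup |f| for the assumed example
gibbs = sine_integral_pi()           # the integral quoted as 1.851...
peaks = [partial_sum(n, math.pi / (n + 1)) for n in (10, 100, 1000)]
lower_bound = gibbs / M              # ratio that bounds B from below
```

Under these assumptions `lower_bound` comes out near 1.179, consistent with the figure 1.17 quoted above, and the values in `peaks` approach the integral from below.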




2. On putting 


we have 


+ c0s vz 


= + — sin vr 


p=1 V 


= +35 (1— cos va) =} a2? +232 sin? 


if $a_n \ge 0$, $(n = 1, 2, \dots)$. We now assume in addition, instead of (2),


$\phi(x) \le M$ in $(-\pi, \pi)$.


Hence, a fortiori, S 2?M/2 in 0 << <7, and 


(7) <M, (m= 1,2,3,° °°). 


Letting now x | 0, (7) takes the form 




Hence $\sum a_\nu$ converges, and $\sum a_\nu \cos \nu x$ converges uniformly in any interval.


Moreover, by (8), 


n 

a 
> a = M——; whence Sw | |< 
2 pal 2 
and 


(9) cos + ay | cos va | = M. 
v=1 1 


Furthermore, let $\phi(x) \ge -M$; then from $a_0 = \dfrac{2}{\pi} \int_0^{\pi} \phi(x)\,dx$ it follows
that $-2M \le a_0 \le 2M$. Also by (8)


(10) day + ay cos = — Sa, > 
1 1 
On putting 


we have 
~ > bysin va, 
n “4 2 
1 


Since $\sin u/u$ is monotonic decreasing in $0 < u < \pi/2$, this becomes for $x = \pi/n$


4 n 
(n = 1, 2,3,---), 
or 
9 n 
(11) y,(2), (n= 1,2,3,- >), 


If now ¥(z) = M in |x| <a, and a fortiori M, 0 < < then 

(11) leads to 

(n = 1, 2, 3, ). 


If, in addition, $\psi(x)$ is continuous at $x = 0$, then $\psi_1(x)/x \to 0$ for $x \downarrow 0$, and
from (11) we derive

(13) $\sum_{\nu=1}^{n} \nu b_\nu = o(n)$, $n \to \infty$.
Using the identity 


(14) 


| 
j 
i 
if 
U0, => w, (n = 2,3,° °°), 
1 1 1 


where we put $u_\nu = b_\nu \sin \nu x$, Fejér’s theorem on the arithmetic means and
(12) lead to
(15) bvsinve | <M 

1 
(9), (10) and (15) yield Theorem I with the smaller constant $2 + \pi/2$ instead
of 4 in the inequality (5).
Moreover, if f(x) is continuous throughout, Theorem II follows from 
(13) and (14). 


3. We shall generalize the assumptions on $\phi(x)$ and $\psi(x)$, and improve
upon the constant in (12). We first prove


THEOREM 1. Let av=0, (v=0,1,2,---), and < then 
1 


2 
exists for x > 0. Suppose moreover 
lim inf WM, 


then 


da, M, | + dv cos ve | M, la| <2, (n = 1,2,---), 


and $\sum a_\nu \cos \nu x$ converges uniformly on the real axis.


For the proof, note that (16) implies 


n 2 
(n= 
2 1 


let 2, | 0 and lim R, (2%) = M; then 
k-00 


n 


lim > +> <= lim R, (2%) = M. 


Hence, 


Ao 


1 
The theorem follows at once from this. 


If in particular $\phi(x) \sim \tfrac12 a_0 + \sum_{\nu=1}^{\infty} a_\nu \cos \nu x$, then


and we have the 


COROLLARY. If


ie. 
~ $40 + av cos vz, ay = 0, (y= 0,1,2,° °°), 


and 


ie oo 
then Sav < 
1 1 


We call u +> ae he the Riemannian mean of the first kind corre- 
1 


co 
sponding to the series > wy. 
1 


4. We next prove 


THEOREM 2. Let bv=0, (v—1,2,3,---), and Siv'by < then 


(17) 


exists for x>0. 
0<2=8; then 


Suppose, in addition, in an interval 


(18) 
1 


/ 
sin? <1.38Mn for n 


where $\alpha$ is the unique root of the equation $2\alpha = \tan\alpha$ in $0 < \alpha < \pi/2$.


We have from (17) 


n 2 2n 
R.(2) = = > Qa (= vb, 


T 1 


for 0 < nz = 7/2; or 


nz sin? 


n= 
1 


2 sin? nx 
where $\alpha/\sin^2\alpha$ is the minimum of $t/\sin^2 t$ in $0 < t < \pi/2$; $\alpha$ is easily found
to be the unique root of the equation $2\alpha = \tan\alpha$ in $0 < \alpha < \pi/2$. The “Table
of Functions” by Jahnke and Emde (2nd edition, 1933, p. 33) gives
$1.16 < \alpha < 1.17 = \alpha_1$, and $\sin\alpha_1/\alpha_1 = 0.7870\ldots$. Then by a simple
calculation

$\dfrac{\alpha}{\sin^2\alpha} < \dfrac{\alpha_1}{\sin^2\alpha_1} < 1.38$.


This proves the theorem. 
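The numerical constant can be reproduced without tables. The sketch below is an illustration under stated assumptions, not the author's computation: it locates the root of $2\alpha = \tan\alpha$ by bisection on the bracket $[1.0, 1.5]$ (a choice made here) and evaluates $\alpha/\sin^2\alpha$, then compares the result with a coarse grid search for the minimum of $t/\sin^2 t$. The function names are hypothetical.

```python
import math

def root_of_tan_equation(lo=1.0, hi=1.5, iters=60):
    """Bisection for tan(a) - 2a = 0; the sign changes between lo and hi."""
    def g(a):
        return math.tan(a) - 2.0 * a

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

def ratio(t):
    """The function t / sin^2(t) whose minimum on (0, pi/2) is sought."""
    return t / math.sin(t) ** 2

alpha = root_of_tan_equation()
constant = ratio(alpha)

# Coarse grid over (0, pi/2) as an independent check on the minimum.
grid_min = min(ratio(k * (math.pi / 2) / 1000) for k in range(1, 1000))
```

In double precision this gives `alpha` near 1.16556 and `constant` near 1.38005, which agrees with the constant 1.38 to three decimals; the stationarity condition follows from differentiating $t/\sin^2 t$, which vanishes exactly when $\tan t = 2t$.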




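If in particular $\psi(x) \sim \sum_{\nu=1}^{\infty} b_\nu \sin \nu x$ and $|\psi(x)| \le M$, then relations (18)
and (14), with $u_\nu = b_\nu \sin \nu x$, give

$\left|\sum_{\nu=1}^{n} b_\nu \sin \nu x\right| \le \left(1 + \dfrac{\alpha}{\sin^2\alpha}\right) M < 2.38M$.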


Furthermore 


— singve\? x 


and, from (19), 


0 


nT sin?a 


in particular, if $\dfrac{1}{x}\int_0^x \psi(t)\,dt \to 0$ as $x \downarrow 0$, then

$\dfrac{1}{n} \sum_{\nu=1}^{n} \nu b_\nu \to 0$ for $n \to \infty$.


“Ms 


ys the Riemannian mean of the second kind 
vx 


We call 0+ 
n 

corresponding to the sequence {s,}, or to the series > uy with > wy = Sn. 
0 0 


5. We now pass to generalized trigonometric series and to almost periodic 
functions. The most general result in the case of positive coefficients is due 
to M. Fekete [2]. 

Let $\phi(x)$ be a measurable real function of the real variable $x$,


sup f 


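and let $\phi(-x) = \phi(x)$. Suppose that $\phi(x) \le U$, and that

(20) $\lim_{\omega \to \infty} \dfrac{1}{\omega} \int_0^{\omega} \phi(t) \cos \lambda t \, dt = a(\lambda) = M\{\phi(t) \cos \lambda t\}$

exists for all $\lambda \ge 0$. It is easy to show that $a(\lambda)$ vanishes, except at an
enumerable set of $\lambda$-values. Denote, in a certain order, by $\lambda_1, \lambda_2, \dots$ those $\lambda$
for which $a(\lambda) \ne 0$; we call the series

(21) $\phi(t) \sim a_0 + \sum a_n \cos \lambda_n t$,

where $a_0 = a(0)$, $a_n = 2a(\lambda_n)$, the Fourier expansion of $\phi(t)$.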
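The mean value $M\{\phi(t)\cos\lambda t\}$ can be approximated numerically for a concrete function. The sketch below is only an illustration: the test function `phi`, the truncation length `T` and the midpoint rule are assumptions made here, and a finite `T` gives the limit only up to an error of order $1/T$.

```python
import math

SQRT2 = math.sqrt(2.0)

def phi(t):
    # Hypothetical even almost periodic function with frequencies 1 and sqrt(2).
    return math.cos(t) + 0.5 * math.cos(SQRT2 * t)

def mean_value(lam, T=2000.0, steps=200000):
    """Midpoint-rule approximation of (1/T) * integral_0^T phi(t) cos(lam t) dt."""
    h = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += phi(t) * math.cos(lam * t)
    return total * h / T

# Approximate a(lambda) at two frequencies of phi and at two other values.
coeffs = {lam: mean_value(lam) for lam in (0.0, 1.0, SQRT2, 1.2)}
```

For this choice of `phi` the approximations are close to 1/2 at $\lambda = 1$, close to 1/4 at $\lambda = \sqrt2$, and close to 0 elsewhere, which matches the statement that $a(\lambda)$ vanishes outside an enumerable set.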




Let $p_1, p_2, \dots$ denote a subsequence of $\{\lambda_n\}$, consisting of linearly independent
numbers; that is, no equation of the form $\sum_{\nu=1}^{h} r_\nu p_\nu = 0$ holds, where
the $r_\nu$ are rational and not all zero. Denote by $\{\mu_n\}$ the subsequence of those
$\lambda_n$ which can be represented in the form

$\lambda_n = \sum_{\nu=1}^{h} r_\nu p_\nu$, $h = h(n)$; $r_\nu = r_\nu(n)$ rational.

Let $0 < \omega < \Omega$ denote given positive numbers. With this notation (with the
restriction $|\phi(x)| \le U$) Fekete proved the theorem:


Suppose that
$a(\mu_n) \ge 0$ for $0 \le \mu_n < \Omega$;

then the “partial sum”

(22) $a_0 + 2 \sum_{\mu_n < \omega} a(\mu_n) \cos \mu_n x$, $\omega < \Omega$,

of the Fourier expansion (21) converges absolutely in $-\infty < x < \infty$; its sum
function $\phi_\omega(x)$, an almost periodic function whose Fourier expansion is identical
with (22), satisfies the inequality

U 


(23) | | 


For the proof, we associate with (21) the “ Fejér-polynomials ” S.(z), 
defined, for g = 1, by 


¥4=-P vo=-P vq=-P 
where $Q = q!$, $P = qQ = q \cdot q!$. If the sequence $(p_\nu)$ consists of $q_0$ terms
only, we put $p_\nu = 0$ for $\nu > q_0$. By virtue of (20) and the linear independence
of the $p_\nu$’s, we can write (24) in the form


(25) = ao + ken an 008 Anz,’ 
where 
g q 
(26) == (1 — Ls!) , whenever 0 << A, = 2 


the $\nu_\alpha$ being integers with $|\nu_\alpha| < P$, while $k_n^{(q)} = 0$ for other values. Obviously
$0 \le k_n^{(q)} < 1$. We shall prove

(27) $\lim_{q \to \infty} k_n^{(q)} = 1$, whenever $\lambda_n \subset \{\mu_\nu\}$.




In fact, we then have 


Hence, for 


where va = for 1S Consequently, from (26), 


= 
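The limit (27) can be illustrated with exact rational arithmetic. The sketch below rests on an assumed reading of (26), namely $k_n^{(q)} = \prod_\alpha (1 - |\nu_\alpha|/P)$ with $Q = q!$, $P = qQ$ and $\nu_\alpha = r_\alpha Q$, so that each factor reduces to $1 - |r_\alpha|/q$ once $r_\alpha Q$ is an integer. The function `fejer_factor` and the sample coefficients are hypothetical and serve only as an example.

```python
from fractions import Fraction
from math import factorial

def fejer_factor(rationals, q):
    """Assumed form of (26): product over alpha of (1 - |nu_alpha| / P),
    where Q = q!, P = q * Q and nu_alpha = r_alpha * Q must be an integer."""
    Q = factorial(q)
    P = q * Q
    k = Fraction(1)
    for r in rationals:
        nu = r * Q
        # If nu is not an integer below P in absolute value, the frequency
        # is not representable at stage q and the factor is taken as 0.
        if nu.denominator != 1 or abs(nu) >= P:
            return Fraction(0)
        k *= 1 - abs(nu) / P
    return k

# Hypothetical rational coefficients r_alpha of one frequency lambda_n.
r = [Fraction(2, 3), Fraction(-1, 4)]
factors = {q: fejer_factor(r, q) for q in (2, 4, 8, 64, 1024)}
```

With these sample coefficients the factor is 0 at $q = 2$ (since $2 \cdot 2!/3$ is not an integer), equals $25/32$ at $q = 4$, and exceeds 0.999 at $q = 1024$, in line with the convergence to 1.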


Finally we prove 


(28) 


This follows from 


g(t) 


where 


From this we see that 


S UM{3[Ky(t + x) + Ky(t—2z)]} =U 


and, in particular, 




Ug and vq > 0, integers. 


> max (@,° 


ay, for q> ©. 
=U. 


+ exp(—i(t+ 2) vapa/@) + exp(1(t —- x) > vapa/@) 


+ exp(—i(t—z2) 
= + 2) + Kq(t—z)]}, 


“exp (it vapa/Q) 


Da(0) =a + an S U. 


We next define for any positive » the “ Riesz-mean ” 


q) = + 


) An COS 






associated with the polynomial (25). Applying the formulas 


sin? _f1—k/f ff 
(29) = f cos os 2kt at—{ 


and using (25), we obtain 


sin? 


Zale + 2) + at 


From this and (28) we find 
Ry(2,q) <= 


sin? dian 


and, in particular, 


Since every term is positive if » < Q, we get for an arbitrary NV 
n<N 


O<An<u 


Now passing to the limit $q \to \infty$,


(1—") a(um) <0, M—M(N), 


Um 


and, a fortiori for 0 < Q, 


m<M 


Here 1 — pm/p > 1— ow /p; hence, 


m<M 
+2(1—) x 4(um) SU, 
and for »> 2 
m<M 


<w 


As $N \to \infty$, we have $M \to \infty$, and


U 


Um <w 


This gives us the inequality (23). Essentially this is Fekete’s proof. 


6. We now consider odd functions $\psi(-x) = -\psi(x)$, and the corresponding
generalized sine-series. We assume again that




sup =f < @ 
0 
and that 
lim= f“y(t) sin At dt b(A) M{w(t) sin At} 


200 
exists for all $\lambda > 0$. Then, again, $b(\lambda)$ vanishes, except at an enumerable
set of $\lambda$-values. We denote in a certain order by $\lambda_1, \lambda_2, \dots$ those $\lambda$ for which
$b(\lambda) \ne 0$, and call the series

(32) $\psi(t) \sim \sum b_n \sin \lambda_n t$, where $b_n = 2b(\lambda_n)$,

the Fourier expansion of $\psi(t)$. We denote by $p_1, p_2, \dots$ a subsequence of $\{\lambda_n\}$,
consisting of linearly independent numbers, and by $\{\mu_n\}$ the subsequence of
those $\lambda_n$ which can be represented in the form

$\lambda_n = \sum_{\nu=1}^{h} r_\nu p_\nu$, $h = h(n)$, where $r_\nu = r_\nu(n)$ are rational.

Again $0 < \omega < \Omega$, and we suppose
$b(\mu_n) \ge 0$ for $0 < \mu_n < \Omega$.


If we associate with (32) the polynomials 


ys... 
= M y(t) (1 P ) 


vq=-P 


then by virtue of (31) we can write 


T q(x) = > hen, sin Anz, 


where 
| Va | 
1 whenever 0 < A, ==) vapa 
a=1 Q a=1 
and $k_n^{(q)} = 0$ for all other cases. We now have


= M{y(t)$[Ka(t—2) + Ka(t +2) ]), 


and on assuming

$|\psi(t)| \le U$, $t > 0$,

we get

(33) $|T_q(x)| \le U$.


Again, introducing the polynomials 




(1 kn? by SIN Anz, w<Q, 
0 < AnSw 


and applying formulas (29) we obtain 


sin? 


Su(a,q) f + 2) 


From this and (33) follows

(34) $|S_\mu(x, q)| \le U$, $x > 0$.

At this point we simplify Fekete’s argument, replacing a theorem of S. Bernstein
by the following


LEMMA. Given a generalized sine-polynomial

$S(x) = \sum_{\nu=1}^{n} b_\nu \sin \lambda_\nu x$, $\lambda_\nu > 0$,

assume
$b_\nu > 0$, $(\nu = 1, 2, \dots, n)$, and $S(x) \le U$ for $0 < x < \delta$.
Then 


20 
sin? oU < 1.380U, for o = 


where $\alpha$ is defined in Theorem 2.


For the proof consider 
— C08 Avr sin \? 


from which we find 


in 2 
do? (5 S(t)dt Sav. 


Since $\omega x \le \pi$,


=> 
dor 


pw 


Finally, choosing $x = 2\alpha/\omega < \pi/\omega$, we have


Avby S sin? U, for é. 


Applying this lemma to the polynomial $S_\mu(x, q)$, we get, from (34),




0< sin? 


and passing to the limit > 


2 > (1-22) pnd (um) wlll. 


0< sin” & 


But 1 — pm/p = 1 —w/p, hence 


2 pmb(um) S 


0<UmSw 


and, as »—>Q, this takes the form 


(36) 


which is sharper than Fekete’s inequality 


QU 
2 (pm) = 
0< UmSw 


if we restrict ourselves to w —z~- < 2 
sin? 


From (36) it follows that 


2 
sin? @ 
dy sin Ant | S 


and from (34), for p» —o, 
€ —*:) bn sin Anz | SU. 


0 < AnSw 


Thus, writing pm for kn©@ if An = pm, 


2 | Pm‘? (pm) sin Pon | 
sin? a( ) 


0< Um Sw 1 
Q 


From (36) follows, as Fekete proved, that > 0(pm) sin pm&% converges 


0< UmSw 
absolutely, hence by (27) 


(37) lim2 pm@b(pm) sin pnt =2 B(pm) sin pnt = Ya(Z), 


(38) ldo(x)| SU 
sin? a(1 




If {px} is a basic sequence of {Ax}, it follows that 
Yo(t) = basinrmt, <Q, 


0 < AnSw 


and this series converges absolutely. If, in particular, all $a_n \ge 0$ and all
$b_n \ge 0$, then $\Omega$ is arbitrary, and, letting $\Omega \to \infty$, from (22) and (30) we find


0 < AnSw 


Hence for any real function f(x) which satisfies 


$|f(x)| \le U$,
$M\{f(t) e^{-i\lambda_n t}\} = C(\lambda_n) = \tfrac12 (a_n - i b_n)$,
we have 
i 
| + (dn cos nz + b, sin nx)| =U [2 | 
The fact that the functions $\phi_\omega(x)$, $\psi_\omega(x)$ are u.a.p. and that their Fourier
expansions are (17) and (29) respectively follows in exactly the same way as
in Fekete’s theorem.

An analogous argument can be applied to derive corresponding theorems for
trigonometric integrals (cf. Szász [5]) and also to improve upon some results
concerning Fourier series with $n a_n \ge -K$, $n b_n \ge -K$ and more general
classes (cf. Szász [4]).


UNIVERSITY OF CINCINNATI. 


REFERENCES. 


[1] L. Fejér, “On a theorem of Paley,” Bulletin of the American Mathematical 
Society, vol. 40 (1934), pp. 469-475. 

[2] M. Fekete, “On generalized Fourier series with non-negative coefficients,” 
Proceedings of the London Mathematical Society, ser. 2, vol. 29 (1935), pp. 321-333. 

[3] R. E. A. C. Paley, “On Fourier series with positive coefficients,” Journal 
of the London Mathematical Society, vol. 7 (1932), pp. 138-144. 

[4] O. Szász, “Convergence properties of Fourier series,” Transactions of the
American Mathematical Society, vol. 37 (1935), pp. 483-500.

[5] O. Szász, “Über die Partialsummen Fourierscher Reihen,” Mathematischer und
Naturwissenschaftlicher Anzeiger der Ungarischen Akademie der Wissenschaften, 1937.
In print.


